
AD-A090 130  MASSACHUSETTS INST OF TECH CAMBRIDGE LAB FOR COMPUTER SCIENCE

SOME NEW METHODS OF MUSIC SYNTHESIS. (U)

AUG 80  W G PASEMAN

UNCLASSIFIED  MIT/LCS/TM-172

F/G 9/2

NL


LABORATORY  FOR 
COMPUTER  SCIENCE 


MASSACHUSETTS 
INSTITUTE  OF 
TECHNOLOGY 


MIT/LCS/TM-172


SOME NEW METHODS OF MUSIC SYNTHESIS


William  Gerhard  Paseman 




August  1980 




This  research  was  supported  by  the  Advanced  Research 
Projects  Agency  of  the  Department  of  Defense  and  was 
monitored  by  the  Office  of  Naval  Research  under 
Contract  No.  N00014-75-C-0661 


545  TECHNOLOGY  SQUARE,  CAMBRIDGE,  MASSACHUSETTS  02139 




SECURITY CLASSIFICATION OF THIS PAGE (When Data Entered)

REPORT DOCUMENTATION PAGE (READ INSTRUCTIONS BEFORE COMPLETING FORM)

1. REPORT NUMBER: MIT/LCS/TM-172
2. GOVT ACCESSION NO.: AD-A090 130
3. RECIPIENT'S CATALOG NUMBER:
4. TITLE (and Subtitle): Some New Methods of Music Synthesis
5. TYPE OF REPORT & PERIOD COVERED: M.S. Thesis - August 1980
6. PERFORMING ORG. REPORT NUMBER: MIT/LCS/TM-172
7. AUTHOR(s): William G. Paseman
8. CONTRACT OR GRANT NUMBER(s): N00014-75-C-0661
9. PERFORMING ORGANIZATION NAME AND ADDRESS:
   MIT/Laboratory for Computer Science
   545 Technology Square
   Cambridge, MA 02139
10. PROGRAM ELEMENT, PROJECT, TASK AREA & WORK UNIT NUMBERS:
11. CONTROLLING OFFICE NAME AND ADDRESS:
   ARPA/Department of Defense
   1400 Wilson Boulevard
   Arlington, VA 22209
12. REPORT DATE: August 1980
13. NUMBER OF PAGES: 113
14. MONITORING AGENCY NAME & ADDRESS (if different from Controlling Office):
   ONR/Department of the Navy
   Information Systems Program
   Arlington, VA 22217
15. SECURITY CLASS. (of this report): Unclassified
15a. DECLASSIFICATION/DOWNGRADING SCHEDULE:
16. DISTRIBUTION STATEMENT (of this Report):
   This document has been approved for public release and sale;
   its distribution is unlimited.
17. DISTRIBUTION STATEMENT (of the abstract entered in Block 20, if different from Report):
18. SUPPLEMENTARY NOTES:
19. KEY WORDS (Continue on reverse side if necessary and identify by block number):
   Artificial Intelligence
   Music Composition
   Real Time
   Music Synthesis
20. ABSTRACT (Continue on reverse side if necessary and identify by block number)

Abstract

There are two distinct sections to this thesis.

The first section discusses music composition, shows why it is a useful domain for Artificial
Intelligence research and presents a set of "Design Rules" that facilitate research in the field of tonal music
composition.

It begins with a short chapter presenting a subset of music theory. This chapter assumes no prior
knowledge of the subject, completely defines all terms used in the thesis, and is geared particularly toward
those unfamiliar with music, those unwilling to learn standard music notation and those interested in
Artificial Intelligence research.

Next, (using the terms defined in the thesis), a context sensitive generative grammar for producing
pitch progressions in the major mode is introduced. It is seen that the grammar can be made context free by
switching between two interpretations of the input string. A mechanism for switching from one interpretation
to another when parsing sentences generated from this grammar is described. It is shown that a model of
music composition, perception and improvisation fits within the framework of the grammar. This multiple
view model and switching mechanism can be interpreted as a primitive "frame".

The second section describes some of the problems and issues encountered while designing the initial
hardware for the Music Aided Cognition Project at M.I.T. All of the developed hardware permits computer
control, performance and recording of music in real time.

The first chapter in this section discusses a machine called the Inexpensive Synthesizer/Recorder. It
is capable of synthesizing 14 square wave voices, each voice having a range of 7 octaves, with each octave having
12 bits of frequency control. Its purpose is to allow the user to record key depression times, key release times
and key impact velocities when playing a keyboard piece. Its primary constraint was low cost, allowing many
copies to be made. Its microprocessor interface allows it to be easily controlled by many different means,
including home computers. The complete schematics for the synthesizer and the controller are provided as an
appendix.

The next chapter discusses an oscillator which synthesizes sound using 32 sine or 8 FM waveforms.
The machine can be easily expanded to produce 256 sine voices and 64 (or more) FM voices. All sine
waveforms in both types of synthesis are weighted with two independent coefficients. Microprogrammable
firmware allows one to produce sound by a limited number of methods other than sine summation or FM
synthesis.
1. Chapter One: Introduction

1.1  Motivation  for  Artificial  Intelligence  Research  in  Music 

This work's basic goal is to show that A.I. research can be furthered through the study of music.

Why  should  A.I.  investigators  care  about  studying  music?  One  reason  lies  in  the  area  of  knowledge 
representation  research. 

The  problem  of  knowledge  representation  plays  a  key  role  in  most  A.I.  questions.  Those  who  have 
studied  the  problems  of  knowledge  representation  generally  focus  their  attention  on  one  of  several  broad 
areas of application. A prominent area of application is "Real World" knowledge representation.

However, the world is a big place full of diverse things used in diverse ways. In developing a formal
system of representation for everything in it, one quickly runs into big problems. There is usually one of
three reactions to these problems. The first is to "patch" the system. This means that the "well defined"
system is not as well defined as it was. The second is to say, "Well, no system can respond to every problem,
but this one responds to most". These two reactions lead to a third approach, one of using multiple
representation systems, each viewing the world from a different perspective. In all cases, the system developer
essentially admits that the problems are very difficult to get one's arms around and that the domain is too
complex for one system to handle.

Therefore, it has proved worthwhile to search for problem domains smaller than the real world in
which to do A.I. research. Many workers have done this by studying problems arising in a less complex
subset of the real world. Indeed, the A.I. literature abounds with "Blocks World" problems. This is a valid
approach, but it is apparently quite hard to leave the "Blocks World" once you have entered it. That is, the
way one extends a developed system back to the "Real World" is not apparent. One reason for this is that
there is no real metric to determine what subsets of real world representation problems follow one another. If
real world representation problems were ordered historically, then A.I. would be easier, but we have had the
same way of viewing things for many millennia. The recorded philosophy of our views shows development,
but the views themselves have no available record of development (that we can find).

However, there is a simpler, but not trivial, problem domain having a relatively long recorded history of
development. It is music composition.

In this field, the problems are well ordered (historically) in terms of difficulty, well documented, and
relatively complete. (Problems in music composition are partially ordered in time, if you will). Therefore, in
doing A.I.-Music research, one can start at music composition's primitive beginnings and work forward in
time, developing theories about how music was structured, how it was composed and how it was perceived.
As naive theories are proposed and discarded to make way for newer (and hopefully less naive) theories, one
should get a better and better idea of the correct way to model the problems involved (and more importantly,
the wrong way to model the problems).

This is the basic idea. One approach to it would be to apply some of the simpler A.I. paradigms at
the beginning of the music composition "time line" and increase the complexity of the applied paradigms
until the analysis-synthesis problems for the studied works at that time were "solved". One then would
increment "time" and repeat the procedure as far up the time line as possible.

Note that this idea can be used in building models which are "derivatives" of microworlds. Instead of
constructing  a  series  of  microworlds,  each  representing  an  element  of  some  time  progression  in  music  history, 
one  constructs  some  absolute  model  initially  and  specifies  each  successive  model  as  being  the  old  model,  plus 
a  set  of  differences. 

Although things such as "cultural factors" would be an important difficulty here, this method could
potentially yield quick returns to the A.I. community. Simple bugs existing in standard paradigms might be
more quickly uncovered and clarified by using the "simpler" problem space.

As the avowed purpose of this project is to help solve "Real World" A.I. issues, several immediate,
and valid, objections to this choice of domain can be raised. The first is "Perhaps the path from the 'Blocks
World' to the 'Real World' is difficult, but the path from the 'Music World' to the 'Real World' is probably
non-existent." This may be true, but it misses part of the point. If A.I. researchers are able to completely
work out the problems in any domain, the insights gained from the experience will help them in any other
domain that they care to attack. The area of music is simple enough and well documented enough to allow
A.I. researchers to achieve considerable success in a relatively short period of time.

Conceivably, there is another benefit that can arise from this choice of knowledge representation
domain. That is in A.I. system testing. One idea of an appropriate test for an integrated Knowledge
Representation and Control system would be something like a Turing Test. For example, in the systems
proposed here, one would ask the system to compose a piece of music in the style of some period, artist or
group. In a linguistic knowledge representation system, this would be roughly equivalent to asking the
computer to say something reasonably intelligent about a particular topic in the way that a given person
would. This idea is admittedly a little naive; however, such tests administered to music generation programs
show promise of being simple without being simplistic.

This idea has implications in the way a music system is built. Since any given person "enjoys" or
"interprets" music differently than any other given person, having a machine that composes "good" music
cannot be the goal here. Rather, the goal is to produce a machine that can produce music in the style of
certain composers. Therefore, a typical computer routine would be called "Bach Counterpoint" as opposed to
"Counterpoint".

This illustrates yet another benefit. Master composers developed methods of directing the interest
(and sometimes even the emotions) of their audiences simply by controlling the sounds that they heard.
Records of how they developed particular works before they were publicly performed are often available to
researchers. The results of such "autograph" studies were used to formulate many of the ideas reported here.

If the idea of spontaneous generation of classical music by computer programs fails, it will still have
shown something if it produces a system by which laymen to the musical world are able to generate pieces of
music relatively easily and these works are similar to the works of masters. Still, to show something, the
model of music composition developed must be intuitively or instinctively understandable. It won't do to
have the work of the layman be limited to operations like selecting probability kernels or note occurrence
matrices.

In summary, the problems encountered in musical analysis and synthesis are identical to many of the
important problems encountered by cognitive scientists in all fields. Music differs from most such fields in
that the material studied has a well recorded history of development. Intuitively, this recorded history seems
to order the problems that a researcher can attack. This intuition alone provides strong motivation for
research in the field if the proper tools are available. In this thesis, I provide some of the necessary tools.

1.2  How  to  Lure  the  Right  People  into  Doing  Music  Research 

Enticing A.I. researchers into the study of music is difficult. Most of the people that should be
working on the problems in this field "know nothing about music" and won't even consider studying it.
Therefore, one goal of this work is to provide tools which make the study of music more attractive and
tractable to them. The question, then, is what tools should be made available.

A similar problem has been worked on recently by Mead and Conway [1].

Mead and Conway are interested in the design of very large scale integrated (VLSI) circuits.
Traditionally, work in this field required a knowledge of semiconductor devices, which required a knowledge
of solid state physics, which in turn required a knowledge of calculus and classical physics.

Mead and Conway virtually eliminated the need for all of this background knowledge by
compressing it into a simple two page set of design rules. In addition, they facilitated the actual chip design
process by introducing a simple language that completely specifies all the parameters needed by the
companies who fabricate the VLSI device. Learning these rules and using various interfaces to this language
allowed specialists in the fields of algorithms, digital logic, topological complexity and systems architecture to
directly participate in solving VLSI related problems (as well as problems in their own field). In short, by
removing the stifling detail, Mead and Conway induced a lot of valuable talent to participate in the field.

Some may say that the design rules are not a "compression" of the information needed to design
chips, but rather a gross oversimplification of key design concepts. This may be true, but it has not hindered
people ignorant of device physics from advancing the state of the art in the VLSI field. This result is
important and it is why their approach should be transferred to the A.I.-music field.

This thesis can be viewed as a progress report on the attempt to duplicate the pedagogical approach
(and  its  consequences)  of  Mead  and  Conway.  The  function  of  the  hardware  section  is  to  describe  progress  on 
the  music  project  equivalent  of  Mead  and  Conway's  layout  language. 

The music theory section attempts to parallel Mead and Conway's design rule idea by specifying
several sets of composition rules for the various (extremely) limited musical examples.

The basic model of "music" central to this work is shown in the figure labeled "A Model of Music".
This figure is a block diagram illustrating the transformations between four different representations of music.
The right hand side of the diagram is concerned with some aspects of music synthesis and the left hand side
equivalently with music analysis. In an ideal system we would extract or add parameters such as
"Performance Practice" and "Sound Objects" as we transform the music from one representation into
another. User interaction would occur through editors at each level.

The  delineation  of  music  shown  in  the  figure  is  central  both  to  the  hardware  development  in  this 
project and to this work's ideas about music composition, representation and cognition.

This  diagram  explicitly  indicates  an  interest  in  both  music  analysis  and  synthesis.  It  would  certainly 
be easier to simply indicate an interest in analysis. Those interested in having the computer compose music
are  strongly  opposed  by  many  different  factions  for  a  variety  of  reasons.  However,  attempting  to  do  both 
analysis  and  synthesis  results  in  the  discovery  of  problems  (and  the  solutions  to  problems)  in  music  analysis 
that  would  not  arise  as  quickly  from  a  study  of  music  analysis  alone. 

As an aside, the ideas represented in the diagram are admittedly a product of the times. The concept
of using procedures to map data from one representation to another wouldn't have occurred if computer
science hadn't provided the parallel. The tendency of the scientific community to use modeling mechanisms
that are currently "fashionable" is annoying. Past examples of "fad" modeling mechanisms are fluid
mechanical models of the human nervous system and models of high level thought processes based on simple
control loops. Now, modeling thought using computer science concepts is in fashion. Unfortunately, this
paper can only follow suit.

To test the theories and results produced at any level in the diagram, support at the other levels must
be provided. This support must lie in both the software and hardware domains. The majority of the initial
hardware support should occur at the Performance Schedule and Acoustical Signal levels. One approach to
the hardware support configuration would be to associate a system bus with each of these levels.

The figure labeled "The Music Project Bus Scheme" shows one such bus scheme. The "Notated
Score" and "Score Kernel" section of the "Music Model" figure is embodied in Lisp programs in a List
Processor, the "Performance Schedule" is centered around the Micro-bus and Nu-bus, and the "Acoustical
Signal" appears on the S-bus. The appropriate place to start work on this or any similar system is with the
Inexpensive Synthesizer-Recorder and/or the Digital Signal Processor. The hardware section of this thesis
discusses these two aspects of the system.

1.3 Hardware Goals and Constraints

The most stringent constraint placed on all our music producing systems is that they produce music
in real time. There are several reasons for having real time (as opposed to non-real time) music synthesis.
One is that real time synthesis will allow users to compose music in an interactive environment. This means
that they can compose more quickly and with less frustration than non-real time systems allow. Another
reason is that the size difference between the data representations at the Performance Schedule and the
Acoustical Signal levels is large. If the "Synthesis" link joining them is fast, not as much data storage on the
Acoustical Signal level will be required as would be otherwise.

Since each of our Computer Music producing systems will be centered about a computer system of
some sort, it is important to have an idea of the current relationship between hardware, software and price in
electronic sound synthesis. One way to do this is to order the currently available sound synthesis features by
desirability. One possible list is shown below.

basic square wave monophonic scale generation in any desired octave

polyphony 

envelope  shaping 

wave  shaping 

multiple  tonalities 

frequency  modulation  and  other  non-linear  techniques 

In producing any of the above list's features in the Inexpensive Synthesizer or the Digital Signal
Processor, it is desirable to let the relatively cheap processing power of microprocessors take the brunt of the
signal processing load. This means that the final instrument will be software as opposed to hardware
intensive.

Slow  processing  speed  is  the  biggest  problem  encountered  in  using  "code"  to  generate  the  above 
features  with  current  microprocessor  instruction  cycle  times.  This  is  where  a  suitable  compromise  between 
the  price,  software  and  hardware  aspects  of  the  design  is  important.  It  means  that  microprocessors  must  be 
used  more  for  control  than  pure  synthesis  purposes. 


The  inexpensive  systems  only  possess  the  first  few  items  on  this  list  since  their  primary  purpose  is  to 
act  as  cheap  scratchpad  devices.  Also,  the  final  inexpensive  synthesizer  is  relatively  applications  independent, 
that  is,  little  interfacing  is  required  to  attach  the  machine  to  either  a  computer  or  any  other  control  device 
(such as a keyboard). This was an important constraint in the final instrument.


1.4  A  Microprocessor  Applications  Learning  Example 


The circuitry developed and built for this project over the past 9 months has made heavy use of
microprocessors. Microprocessors limited the scope of the project in many ways. One signal processing
algorithm that shows how badly microprocessors can perform is shown below.

High Resolution Digital to Analog converters are expensive pieces of equipment. Perhaps they can
be done away with entirely by using a microprocessor implementation of an analog modulation technique
called Pulse Duration Modulation (PDM). The advantage of using PDM for encoding is that one can recover
the original signal by simply feeding the encoding through a low pass filter. This method would eliminate the
need for a DAC and a track on the Util end of a Digital Synthesizer.

PDM is certainly a viable encoding mechanism in the analog domain. It works by encoding the
analog signal's instantaneous amplitude in the pulse width of each sample pulse. One algorithm for this
method of encoding is shown in the figure labeled "Pulse Duration Modulation". Here, a sawtooth is added
to the sample value and the result is compared to a threshold value. If the result is above the threshold, the
serial output sends a "1". If the value is below the threshold, the serial output sends a "0".
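
The sawtooth-and-threshold comparison described above can be sketched in a few lines of Python. This is only an illustration, not the circuit in the figure: the names `pdm_encode` and `pdm_decode`, the 8-bit sample range, the rising integer sawtooth and the choice of threshold are all assumptions made for the example, and the averaging step is a crude stand-in for the low pass filter.

```python
def pdm_encode(sample, levels=256):
    """Encode one sample (0 <= sample < levels) as `levels` serial bits.

    A rising sawtooth (0, 1, ..., levels - 1) is added to the sample and
    the sum is compared with a fixed threshold. The output bit is 1 when
    the sum is above the threshold and 0 otherwise.
    """
    threshold = levels - 1  # assumed threshold: top of the sawtooth swing
    return [1 if sample + ramp > threshold else 0 for ramp in range(levels)]


def pdm_decode(bits, levels=256):
    """Recover the sample by averaging the pulse train over one period.

    Averaging is used here as a simple substitute for a low pass filter.
    """
    return round(sum(bits) * levels / len(bits))


if __name__ == "__main__":
    bits = pdm_encode(100)
    print(sum(bits), pdm_decode(bits))
```

Under these assumptions the number of "1" bits in a period equals the sample value, so the pulse width carries the amplitude.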

Replacing the DAC using a simple microprocessor algorithm based on the figure seems reasonable.
The algorithm first inputs the value to be converted. Next, it outputs a "1" and initializes a comparison
counter (which can be assumed to be the length of the value to be converted) to all "0"'s. It increments the
counter until it matches the input value and then drops the output to "0". Finally, it waits until the sample
period ends and starts over again.
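
The counter loop just described can be simulated as follows. This is a hedged sketch rather than the code used in the project: the function name `pdm_period` is hypothetical, one list entry stands for one pass through the inner loop, and the output is assumed to drop as soon as the counter equals the input value, so an input of zero produces no pulse at all.

```python
def pdm_period(value, bits=8):
    """Simulate one sample period of the counter-based PDM loop.

    Returns the output level seen on each pass through the inner loop.
    There are 2**bits passes per sample period.
    """
    steps = 1 << bits      # loop passes per sample period
    counter = 0            # comparison counter, initialised to all zeros
    level = 1              # the output starts high
    output = []
    for _ in range(steps):
        if counter == value:
            level = 0      # counter matched the input: drop the output
        output.append(level)
        counter += 1       # after the match this only marks time until the period ends
    return output


if __name__ == "__main__":
    print(pdm_period(3, bits=2))
```

The point of the sketch is that every sample costs 2**bits loop passes regardless of the input value, which is what drives the instruction rate estimate in the text.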

Let's calculate what type of instruction cycle time is needed to execute this algorithm. We will choose
a sampling frequency of 10 kHz and let the amplitude take on 256 discrete values. This is certainly a minimal
system. The synthesizable waveforms are bandlimited to signals having a frequency content of less than 5,000
Hz and the signal to noise ratio is only 52 dB. Still, with a worst case input (consisting of the minimum
amplitude), this means that we need to perform the algorithm at a frequency of (10 kHz)(256) = 2.56 MHz!
This is about the frequency at which the clock of most microprocessors operates. Therefore, the idea is clearly
unsuitable for microprocessor implementation. Why is this so? There are two reasons.

A microprocessor has an unacceptably high overhead when it has to execute a trivial subroutine. In
the above example, the trivial subroutine is in the inner loop of the algorithm, so the overhead occurs all the
time. A DAC with a PDM output would have its ramp and compare functions implemented in hardware.

Even more importantly, most DACs operate on the input binary word using a parallel conversion
algorithm. Increasing the length of the input word to a DAC requires a linear increase in hardware (up to a
point) to keep the conversion algorithm a constant time operation. Increasing the word length of the input
word to the processor algorithm requires an exponential decrease in processing time for a fixed sampling
frequency. This exponential time versus linear hardware tradeoff hurts the microprocessor considerably.
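
The scaling argument can be made concrete with a short calculation. The helper name `required_loop_rate_hz` is an assumption for illustration, and the sketch counts only one inner-loop pass per amplitude step, so it gives an optimistic lower bound on the instruction rate.

```python
def required_loop_rate_hz(sample_rate_hz, bits):
    """Inner-loop passes per second needed to resolve 2**bits pulse widths."""
    return sample_rate_hz * (1 << bits)


if __name__ == "__main__":
    # Each extra bit of word length doubles the required rate.
    for bits in (8, 10, 12, 16):
        rate = required_loop_rate_hz(10_000, bits)
        print(f"{bits:2d} bits: {rate / 1e6:.2f} MHz")
```

At 10 kHz and 8 bits this reproduces the 2.56 MHz figure from the text, and each additional bit doubles the requirement.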

As this example shows, current microprocessor implementation of some hardware analog (and
digital) functions is still very limited in what it can do.


2.  Chapter  Two:  Background 


2.1 Prior Research in Computer Music Composition

All programs that attempt to compose in different styles (jazz, classical, modern) use parameters
obtained by analysis on some level.

In studying methods of music analysis, one sees that most operate by music (or score) decomposition.
It is the methods of decomposition and the decomposition's atomic units which serve to differentiate one
analysis procedure from another.

Some of the earliest attempts at computer composition tried to view music serially [4]. They
decomposed music from left to right and composed by a generate and test procedure. Notes were randomly
generated, and allowed to pass into the output file if they didn't violate local constraints. Such methods
produced reasonable music locally, but the overall structure was not coherent. Attempts were made to get
around this problem. One method involved dividing a composition into sections, and having strict specific
rules for each section. This provided further "filtering" for the probabilistic note generation procedure and
did not let as much "noise" through. This produced somewhat better compositions, but they were usually
locked into one very strict style.

Rader[7] has developed a method for composing simple rounds using a set of composition rules
which are monitored by a set of meta rules. The long term structure in this work was therefore a function of
the round "style".

Recently, methods of musical analysis based on work done by Heinrich Schenker[9] have been
proposed and extended by Kassler[2], Smoliar[11], Lerdahl-Jackendoff[5] and others. Kassler and Smoliar
framed their ideas in terms of computer programs. Unfortunately, detailed results of Kassler's latest research
in this area are not available[3]. Smoliar has translated some of Schenker's work into a series of Lisp
functions. A small manual describing the functions is available[13].

The initial gross structure of the analysis method proposed here was developed without any
knowledge of the above authors' work. There are similarities however. As the gross structure is described
below, the similarities and differences will be described as they arise.

2.2  A  Model  of  Music  Perception 


Suppose, as one listened to a piece of music, one analysed it by several different procedures
simultaneously. Suppose further that these analyses could not nicely describe the music in terms of their
primitive operators throughout the length of the entire piece, that is, that there were a few "rough spots"
present in each procedure's analysis. What are some of the details one would notice if these analyses were laid
out, measure by measure, one beneath the other?

One is that the rough spots in the different forms of analysis would occur in different places in the
music. It seems clear that an ultimate representation of (well composed) music at the perceptual level should
be free of these rough spots. To get rid of them, one can take advantage of the multiple analyses by tying
them together into one global representation of the musical piece. This can be done by switching from one
analysis to another before hitting these rough spots. This process can be thought of as periodically switching
viewpoints of a piece of music as the piece progresses. An analogy that might be made here is one of walking
alongside a cylindrical crystal prism. Inside the prism is the score. As we proceed down the prism's length,
we look through the facet that currently gives us the best view of the music. (This analogy is reminiscent of
those used in Minsky's Frames[21] and Sussman's Slices[23].)

As these representations of musical knowledge are viewed, problems could arise as one proceeded
down the crystal column. If one arbitrarily switched viewpoints, one could switch from the end of one
representation into the middle of another representation. This is a problem because it would seem best to
switch from the end of one representation to the beginning of another. We will make switching at a boundary
a constraint in the system of viewpoint analysis that produces our global representation.

Suppose there were cases where there are no "beginnings of representations" that the analyst can
switch to. In these cases, the analyst will have to "back up" on some other analysis in order to find the most
appropriate place to switch. This would correspond to confusion on the listener's part.

Sometimes this procedure may not suffice. If that's the case, then a new lexical viewpoint may have
to be formed which is a distortion of a more standard lexical viewpoint. This distortion could be viewed as
"an imposition of additional constraints on the syntactic structure of an earlier lexical analysis". Realistically,
this is nothing more than altering the original conception of reality in order to add functionality to the model.

As an aside, one recalls that one of the key problems in the Frame paradigm is one of control. How
does one know what frame to pull in next without having the search tree grow combinatorially? Assuming
that this model is valid, investigating how good composers lead people into switching their "viewpoints"
without confusion as a score progresses could lend some insight into this problem.

It has been implicitly stated that each of these analyses is a complete description of the piece. For
example, one analysis may be melodic, another harmonic, depending on how the listener is viewing the piece
at the time, but each completely specifies all the information that the listener needs to know at that moment.
Can each of these viewpoints be further subdivided? It is assumed here that the answer is yes. It is further
assumed that each of these "atomic analyses" is only a partial analysis, that a complete description of the
piece can only be obtained by combining two or more of the partial analyses. The idea of many partial
analyses is central to the work of Lerdahl-Jackendoff and other disciples of Schenker.

2.3  Methods  of  Analysis 

One type of "atomic analysis" will be specified here. It is Tonal Analysis, which is a transcription of a part of
Schenker's work into a context sensitive grammar and a set of meta rules.

Some view musical composition as a tangle of interrelated constraints. It is suggested here that the
tangle be attacked by grabbing hold of several parts in the tangle and pulling them apart. Naturally, they will
not come totally apart; there will be connections between them. (Picking an analysis procedure that, in
general, minimizes the total number of these interconnections may be a good heuristic for determining
whether a given "n" part description is better than another part description.) It is felt that focusing the
total analysis around these "almost-hierarchies" will give a more coherent picture of musical structure than
simply specifying the constraints connecting them together.


3. Chapter Three: Some Music Theory


3.1 Of Strings, Boards and Pegs

Take a board. At either end of the board, mount a peg. String a wire between the two pegs. Pluck
the wire. A sound will result. This sound is due to the wire vibrating at its first harmonic frequency. We will
give this sound another name, we will call it the wire's tonic pitch. (Pitch and frequency are terms that will be
used interchangeably.)

Clamp the wire to the board at its midpoint. With the wire clamped to the board, pluck one side of it.
A sound different from the first will result. This second sound, produced by the wire vibrating at twice the
frequency of the first sound, is called the wire's second harmonic frequency. Sounds whose frequencies differ
by one factor of two are said to be an octave apart in pitch. Sounds whose frequencies differ by only factors of
two are said to belong to the same pitch class. Therefore the first sound's frequency and the second sound's
frequency belong to the same pitch class (that of the tonic) and are an octave apart.

Reclamp the wire to the board one third of the way from one peg to the other. With the wire
clamped to the board, pluck the shorter side of the wire. A sound different from the first two will result. It is
the result of the wire vibrating at three times its original frequency. This is called the wire's third harmonic
frequency. The third harmonic frequency is given another name. It (and each of its pitch class equivalents) is
called the wire's dominant pitch.

In general, the nth harmonic of the wire is produced by clamping the wire down one nth of the
distance from one peg to the other and plucking the shorter side. A frequency n times that of the tonic will
then result.

The wire's fourth harmonic frequency is one octave above its second harmonic frequency. Therefore,
the wire's fourth, second and first harmonic frequencies belong to the same pitch class.

The wire's fifth harmonic frequency (and each of its pitch class equivalents) is given a special name, it
is called the wire's mediant pitch.

The wire's sixth harmonic frequency is the pitch one octave above the dominant. The eighth
harmonic frequency is an octave above the fourth harmonic frequency. The seventh and ninth produce new
pitches.

Now that some of the relationships between the first nine harmonics of a vibrating wire have been
made clear, an interesting observation can be made after introducing four new terms. These terms are octave
membership, scale, pentatonic scale and system of intonation.

3.2 Scales and Systems of Intonation

If the frequencies of a set of pitches are divided or multiplied by two until they lie between the pitch
designated as the tonic and the pitch an octave above the tonic, then these pitches have been made members of
the same octave. If these pitches are then sorted by ascending order of frequency, they form a scale. If these
operations are performed on the first, third, fifth, seventh, and ninth harmonics of the above example, the
pentatonic scale is produced. (In the pentatonic scale, the dominant and the mediant have frequencies 3/2
and 5/4 that of the tonic.)
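The octave folding and sorting just described can be sketched in a few lines of Python. This is only an illustration, not part of the original work: the function names are hypothetical, frequencies are expressed as exact ratios to the tonic, and "between the tonic and the octave above" is assumed to mean the half-open range from 1 up to (but not including) 2.

```python
from fractions import Fraction


def fold_into_octave(ratio):
    """Multiply or divide a frequency ratio by two until it lies in [1, 2)."""
    ratio = Fraction(ratio)
    while ratio >= 2:
        ratio /= 2
    while ratio < 1:
        ratio *= 2
    return ratio


def scale_from_harmonics(harmonics):
    """Fold each harmonic number into the tonic's octave, then sort ascending."""
    return sorted(fold_into_octave(h) for h in harmonics)


# First, third, fifth, seventh and ninth harmonics, as in the text.
pentatonic = scale_from_harmonics([1, 3, 5, 7, 9])
print([str(r) for r in pentatonic])
```

Under these assumptions the third harmonic folds to 3/2 (the dominant) and the fifth harmonic to 5/4 (the mediant), matching the ratios quoted above.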

In order to perform most music, one must choose a system of intonation, that is, a fixed set of pitches,
in which to play it. Separate cultures around the world have independently invented and used systems of
intonation based on the pentatonic scale for many millennia. The ancient Scottish, Chinese, African,
American Indian, East Indian, Central American, South American, Australian, Finnish and Balinese cultures,
just to name a few, used variations of the pitches forming the pentatonic scale in their music. However, no
culture uses the pentatonic scale in its pure form. The pentatonic pitch varied in their systems of intonation is
usually the one based on the seventh harmonic. This means that although the series of frequencies found in
the integer harmonics of vibrating objects are related to the pentatonic scale (and all other scales), they are not
the only influence that produced them.

Documenting all the influences that led to the system of intonation in popular use today is beyond
the scope of this thesis. The current result of this tonal evolution is a system of intonation called the equal
tempered system. The equal tempered system is based on a scale of 12 pitches, each pitch's frequency
separated from that of its neighbor by a factor of the twelfth root of two. This twelve pitch scale is called the
chromatic scale. The pitches in this scale are called members of the chromatic collection. Members of the
scale, spanning two octaves and ordered by frequency, are sometimes designated by the symbols:

C C# D D# E F F# G G# A A# B C' C#' D' D#' E' F' F#' G' G#' A' A#' B'

The number of quotes (') indicates which octave each pitch class belongs to. The twelfth root of two
factor between notes is called a semitone. The term "semitone" is used with words denoting distance. For
example: "The distance between any two adjacent scale members, such as C# and D, is one semitone."

3.3 The Diatonic Collection

Seven members of the chromatic scale have a fundamental role in many styles of western music. These styles
of music emerged before the time of Bach and continued in strength till the time of Wagner. They are also
present in popular music today. These seven members of the chromatic collection form a group called the
diatonic collection. The diatonic collection can be constructed using the first, third, fifth, sixth, eighth, tenth,
and twelfth members of the chromatic scale (relative to the tonic). If a piece of music uses a diatonic
collection chosen in this way, it is said to be written in a major key. A scale formed from these members is
called the major scale. If a piece of music is written in G major, the composer chose G as the piece's tonic and
the associated major scale would be:

G A B C' D' E' F#'
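As a rough illustration of the construction above, the following Python sketch picks the first, third, fifth, sixth, eighth, tenth and twelfth chromatic members relative to a chosen tonic. The names `CHROMATIC`, `MAJOR_MEMBERS` and `major_scale` are hypothetical, and a trailing quote is assumed to mark a pitch that falls in the next octave of the chromatic listing.

```python
CHROMATIC = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# First, third, fifth, sixth, eighth, tenth and twelfth chromatic members,
# written as zero-based semitone offsets from the tonic.
MAJOR_MEMBERS = [0, 2, 4, 5, 7, 9, 11]


def major_scale(tonic):
    """Return the major scale on `tonic`, quoting pitches in the next octave."""
    start = CHROMATIC.index(tonic)
    scale = []
    for offset in MAJOR_MEMBERS:
        position = start + offset
        name = CHROMATIC[position % 12]
        scale.append(name + "'" * (position // 12))
    return scale


print(" ".join(major_scale("G")))
```

The same function applied to C gives the unquoted white-key sequence C D E F G A B, since no member crosses into the next octave.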

The semitone intervals between these members of the chromatic collection form the pattern
2,2,1,2,2,2,1. If, relative to the tonic, the music produced is based on the semitone intervals 2,1,2,2,1,2,2, then
the piece is in a minor key. Suppose a performer played a falling diatonic scale running over five octaves. If
one came in after the performance started and left before it finished, one could not determine if the scale had
been played in a major or minor key (unless it was indicated by the performer "stressing" certain notes as he
played). The reason is that one would not know what the tonic pitch of the scale was. A section of that five
octave pattern based on intervals is shown below:

...1,2,2,2,1,2,2,1,2,2,2,1,2,2,1,2,2,2,1...

Therefore, one can conclude that the concept of a tonic pitch is important in diatonic music.
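The ambiguity can be checked mechanically: the minor pattern is a rotation of the major pattern, so an endless run of the intervals looks the same for both until a starting point (the tonic) is chosen. The short Python sketch below is illustrative only, and `rotation_offset` is a hypothetical helper name.

```python
MAJOR_PATTERN = [2, 2, 1, 2, 2, 2, 1]
MINOR_PATTERN = [2, 1, 2, 2, 1, 2, 2]


def rotation_offset(pattern, other):
    """Return the first k such that rotating `pattern` left by k gives `other`, else None."""
    for k in range(len(pattern)):
        if pattern[k:] + pattern[:k] == other:
            return k
    return None


offset = rotation_offset(MAJOR_PATTERN, MINOR_PATTERN)
print(offset)
```

A non-None result means the two keys share one cyclic interval pattern and differ only in which member is treated as the tonic.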

Note that nothing has been said yet about absolute pitch. As all the frequencies in the equal tempered
system of intonation differ from one another by a fixed ratio, one need only specify one pitch in order to
specify all of them. "Standard" pitches have been specified several times in the last several hundred years.
The tendency of each new standard is to increase the frequency of the system used up to that point. The
result is that the current standard is almost one semitone higher than the standard used in Beethoven's time.
This also means that Beethoven's works written in the key of C major are now performed in almost C# major.

3.4  Notes  on  Note  Notation 

Standard music notation employs a floating point unary representation for encoding pitch (i.e. clefs
and notes on a staff). This method is advantageous for performance and some types of analysis. It will not be
used here.

The method used here is geared toward representing single voice music written totally in a major key.
It designates pitch names with the numbers 1, 2, 3, 4, 5, 6 and 7. Each number represents a member of the
diatonic collection sorted by pitch into the major scale. Mapping this representation to sound is easy. If the
key of C major were chosen to be the major key, 1, 2, 3, 4, 5, 6, 7 and 1' would map to C, D, E, F, G, A, B and
C' respectively.

Close relatives to the tonic, dominant and mediant of the pentatonic scale are present in the major
scale. The relative to the tonic is labeled "1", the dominant's relative is labeled "5", and the mediant's relative
is labeled "3". Remember that the members of the major scale are separated by the semitone pattern
2,2,1,2,2,2,1, and that one semitone is the twelfth root of two. This means that the ratio of the major
dominant's frequency to the tonic is 2**(7/12) = 1.4983 (Recall that the pentatonic scale's value was 3/2 =
1.5). And the ratio of the major mediant's frequency to the tonic is 2**(4/12) = 1.2599 (Recall that the
pentatonic scale's value was 5/4 = 1.25). These values are so close that the distinction "major dominant" or
"pentatonic mediant" will no longer be made. Just the terms tonic, mediant and dominant (equivalent to 1, 3
and 5 in our notation) will be used.
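The comparison between the equal tempered and pentatonic ratios is easy to reproduce. The Python sketch below is illustrative only: the helper name `equal_tempered_ratio` is hypothetical, and the semitone counts (7 for the dominant, 4 for the mediant) are obtained by summing the major scale pattern 2,2,1,2,2,2,1 up to scale degrees 5 and 3.

```python
SEMITONE = 2 ** (1 / 12)  # twelfth root of two


def equal_tempered_ratio(semitones):
    """Frequency ratio to the tonic of a pitch `semitones` above it."""
    return SEMITONE ** semitones


dominant = equal_tempered_ratio(7)  # 2 + 2 + 1 + 2 semitones above the tonic
mediant = equal_tempered_ratio(4)   # 2 + 2 semitones above the tonic

for name, tempered, pure in [("dominant", dominant, 3 / 2), ("mediant", mediant, 5 / 4)]:
    error = abs(tempered - pure) / pure * 100
    print(f"{name}: tempered {tempered:.4f}, harmonic {pure:.4f}, difference {error:.2f}%")
```

Both differences come out below one percent, which is the sense in which the two sets of values are treated as equivalent here.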

The numbers in our notation will also be called scale degrees. The octave of the scale degree will be
notated using single quotes ('). For example, 2' is the scale degree located between the mediant and the
tonic in the second octave.

3.5  Intervals 

This work is interested in composing and modeling the perception of pitch progressions in major
keys. As this is the case, it is first necessary to determine the relationships between the pitches and groups of
pitches which form these progressions, such as melodies. One immediate observation is that the vast majority
of melodies have no frequency jumps spanning more than an octave. (This is due in part to the limitations of
the instrument used to perform most melodies: the human vocal tract.) Therefore, in examining the local
relationships between two pitches in a melody, one could construct a table with a two octave span of the
diatonic collection on each axis. The intersection of each axis entry could be used to record information about
the experiment performed.

In one such experiment, it was determined how "unstable" any two successive pitches sounded when
played one after another. This instability could result from one of two feelings, either that the pitches formed
a progression, and the progression was incomplete, or that the two pitches simply sounded as though they
didn't belong together.

The figure labeled "Intervals Between Members of the Diatonic Collection Forming the Major
Scale" contains the condensed results of this experiment. In the figure, the diatonic members forming the
major scale were separated by their semitone distances along a line. Then, all line segments with endpoints
lying on a diatonic member were drawn and sorted by semitone length. As previously discussed, the seven
diatonic members are represented by the numbers 1 through 7. Their pitch class equivalents an octave higher
are represented by the numbers 1' through 7'.

The results of the experiment were that the groups tagged with black squares: m2, M2, a4, d5, m7
and M7, were felt to be unstable. The groups not tagged: u1, m3, M3, p4, p5, m6, M6 and p8 were felt to be
stable. It was found that in isolation, all members of each group were perceived as being equivalent. As can
be seen, there are only 14 groups. They are each designated by a letter-number pair. The number indicates
how many different members of the diatonic scale are crossed by the segment (including its endpoints). The
letters are the first letters of names given to the groups. It is not necessary to know the names to understand
the work presented here, but they are presented as a matter of interest. The complete names in order from
the top of the figure are unison, minor second, Major second, minor third, Major third, perfect fourth,
augmented fourth, diminished fifth, perfect fifth, minor sixth, Major sixth, minor seventh, Major seventh and
perfect octave. M2 and m2 are the minimum length intervals shown. The number of these intervals contained
in each segment is recorded in the columns labeled "M2" and "m2".
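The grouping described above can be reproduced by enumeration. The Python sketch below is an illustration under stated assumptions, not the procedure used to draw the figure: it places the major scale on a semitone line, takes every segment that starts on one of the seven scale degrees and spans from a unison up to an octave, and labels each segment by the pair (number of scale members crossed, semitone length). The names `GROUP_NAMES`, `UNSTABLE` and `interval_groups` are hypothetical.

```python
from collections import Counter

MAJOR_OFFSETS = [0, 2, 4, 5, 7, 9, 11]  # semitone positions of degrees 1..7

# (scale members crossed including endpoints, semitone length) -> group label
GROUP_NAMES = {
    (1, 0): "u1", (2, 1): "m2", (2, 2): "M2", (3, 3): "m3", (3, 4): "M3",
    (4, 5): "p4", (4, 6): "a4", (5, 6): "d5", (5, 7): "p5", (6, 8): "m6",
    (6, 9): "M6", (7, 10): "m7", (7, 11): "M7", (8, 12): "p8",
}

# Groups reported as unstable in the experiment described in the text.
UNSTABLE = {"m2", "M2", "a4", "d5", "m7", "M7"}


def semitone_position(index):
    """Semitone position of the diatonic member `index` (0 = degree 1, 7 = degree 1')."""
    return 12 * (index // 7) + MAJOR_OFFSETS[index % 7]


def interval_groups():
    """Count the segments in each group, one segment per start degree and span."""
    groups = Counter()
    for start in range(7):       # degrees 1 through 7
        for span in range(8):    # unison through octave
            length = semitone_position(start + span) - semitone_position(start)
            groups[GROUP_NAMES[(span + 1, length)]] += 1
    return groups


groups = interval_groups()
print(len(groups), "groups,", sum(groups.values()), "segments")
for name, count in groups.items():
    print(name, count, "unstable" if name in UNSTABLE else "stable")
```

Counting one segment per starting degree and span in this way gives 7 x 8 segments in total, distributed over the fourteen letter-number groups.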

In addition to the main result, this figure reflects two empirical observations that were made. First,
the order in which the pitches are played does not matter. The succession 3 4 gave results equivalent to those
obtained from 4 3. The second observation was the concept of octave equivalence of interval. This means
that the progression 3 4 gave the same results as the progression 3' 4'.

These three empirical observations: limitation of octave jump, directional equivalence of interval
jump and octave equivalence of interval mean that there are only 56 absolute intervals that can be used by
melodies whose members consist solely of pitches belonging to the diatonic collection. (If non-diatonic
members are allowed, the number of possible absolute intervals increases to 156 using these three
observations.) All these 56 intervals, represented as line segments between members of the diatonic
collection, are shown in the figure.


3.6 The Tonic, Dominant and Mediant

The title of this section lists the three pitches given special names in this chapter. This group of three
pitches will be given a special name here, they will be called the tonic triad. This term can mean different
things in different contexts, but here it will refer only to this group of three pitches. Note that all the intervals
between members of this group: 1-3, 1-5, 3-1', 3-5, 5-1' and 5-3' are stable intervals.

Music that embodies the concept that these three tones are special is called tonal music. Theories that
attempt to say things about such music are called tonal music theories.

This thesis presents a small tonal music theory. The main argument that will be presented is that
diatonic pitch progressions produced in the major keys have a common mechanism associated with them.
The key feature in this mechanism is that attention is constantly drawn toward and away from the three pitch
classes that belong to the tonic triad. The ramifications of this theory will be presented in the following
chapters.

3.7  Summary:  Constraints  is  spelled  with  ’ai’ 

The purpose of this section was to introduce several terms. It had another purpose as well. That was
to present some of the local physical constraints that exist in a major key melodic line. Assuming that the
results of the presented experiment demonstrate "physical constraints" is dangerous. This assumption was
not made lightly. It is based on the fact that the "experiment" describes many perceptions that have been
recorded in hundreds of thousands of documents since the beginning of civilization.

But the results of the experiment are a function of time. Only a subset of the intervals now viewed as
stable were considered stable 1,000 years ago. And only a subset of those stable 1,000 years ago were
considered stable 1,000 years before that. The perception of stability also varies from culture to culture, and
sometimes from person to person. However, the same can be said about key concepts in the phonology,
syntax and semantic structure of language. Here, as in all AI research, one must work with what material one
has. The material presented in the figure is better than most.

Knowing the physical constraints in any problem is important. As much A.I. work has
shown[22],[21],[23], when making a breakdown between two elements that are physically related to each other,
it is best to make the breakdown along some perceived physical boundary. The initial boundary perceptions
observed in this thesis are those that have been described here.


4. Chapter Four: A Tonal Grammar


Semper  idem  sed  non  eodem  modo 


4.1  Music 

In contemporary Western Society, "Music" is a word used to denote a large number of different
experiences. The catholic meaning given to the term is probably due more to people's inability (or lack of
desire) to discriminate between perceived events than to anything else. The same can be said for the term
"Music Theory". Therefore, since the usage here is very specific, a definition of "Music Theory" will be
presented. Recall that no definition can be "wrong" (by definition!), however it can prove to be worthless. It
is, of course, left to the reader to judge the merits of the definition himself.

A Music Theory is a formal system, some of whose objects are to be interpreted as auditory events.

There is nothing new or startling about the above definition. It is essentially a recapitulation of the
"Music Model" figure with the added constraint that the indicated procedures and symbols satisfy the
requirements of a formal system. It is stated here explicitly in order to still the majority of arguments made
concerning ideas in this work. Once it is stated, the only real argument left is whether or not work based upon
this viewpoint is worthwhile.

The music theory presented in this chapter is concerned with the tonal progressions of music written
in the major mode. At the beginning, it totally ignores questions of pitch duration and stress. It will report
observations on pitch progressions from a number of different levels. Representing knowledge obtained from
one set of these observations is the subject of the next section.


4.2 Grammars

There are a variety of ways to express knowledge in a formal system. On the level of music examined
here, knowledge is expressed in terms of production rules. Production rules are used because on a local level,
music consists of multiple, non-trivially different, independent states. This makes it feasible to write multiple,
non-trivial, modular rules[18]. A benefit of this method of representation is that it draws a clean line between
pieces of knowledge and the processes that use them. This allows this work to present these production rules
as a set of "design rules" for a subset of music.

The work upon which these rules are based was written in the early part of this century by an
Austrian music theoretician named Heinrich Schenker[9],[10]. He and his modern day disciples expressed
their ideas in what are essentially sets of context sensitive rules. Some of the rules can be found in Regener[8],
Kassler[Kassler75] and Smoliar[13] as well as in Schenker. The subset of rules presented here is designed
primarily for use in the upper line of a piece of multivoice music. These rules are expressed procedurally
below.

All music has a main structure. This structure is called the fundamental descending line. The
fundamental descending line has one of three forms, they are:

3 2 1, 5 4 3 2 1, or 1' 7 6 5 4 3 2 1

Rules  govern  the  addition  of  pitches  to  (and  the  insertion  of  pitches  into)  the  fundamental 
descending  line.  These  rules  are  called  triadic  repetition,  neighbor  insertion,  triadic  insertion  and  step  motion. 

The rule of triadic repetition states that one pitch may be replaced by two successive pitches if the
pitch is a triadic member.

The  rule  of  neighbor  insertion  states  that  any  repeating  triadic  member  may  have  one  pitch  one  scale 
degree  higher  or  one  scale  degree  lower  inserted  between  the  repeating  notes. 

The rule of triadic insertion states that a triadic member may be inserted between any two notes if no
unstable intervals are created.

The rule of step motion states that any two pitches may be joined by a complete ascending or
descending series of intervening scale degrees or steps.
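To make the four rules concrete, here is a minimal Python sketch of them as operations on a list of pitches. It is an illustration under several assumptions that are not taken from the original work: pitches are encoded as integers (0 for scale degree 1, 7 for 1', and so on), all function names are hypothetical, triadic repetition is read as replacing a pitch by two copies of itself, and "unstable" is tested by semitone length (1, 2, 6, 10 or 11 semitones, octave equivalent), following the unstable groups m2, M2, a4, d5, m7 and M7 named in the last chapter.

```python
MAJOR_OFFSETS = [0, 2, 4, 5, 7, 9, 11]   # semitone positions of degrees 1..7
TRIAD_DEGREES = {1, 3, 5}                # tonic, mediant, dominant
UNSTABLE_SEMITONES = {1, 2, 6, 10, 11}   # m2, M2, a4/d5, m7, M7


def degree(p):
    return p % 7 + 1


def is_triadic(p):
    return degree(p) in TRIAD_DEGREES


def semitones(p):
    return 12 * (p // 7) + MAJOR_OFFSETS[p % 7]


def is_unstable(a, b):
    return abs(semitones(a) - semitones(b)) % 12 in UNSTABLE_SEMITONES


def triadic_repetition(line, i):
    """Replace the triadic member at position i by two successive copies."""
    if not is_triadic(line[i]):
        raise ValueError("only triadic members may be repeated")
    return line[:i] + [line[i], line[i]] + line[i + 1:]


def neighbor_insertion(line, i, direction):
    """Insert the upper (+1) or lower (-1) neighbor between a repeated triadic member."""
    if direction not in (1, -1):
        raise ValueError("direction must be +1 or -1")
    if line[i] != line[i + 1] or not is_triadic(line[i]):
        raise ValueError("needs a repeating triadic member")
    return line[:i + 1] + [line[i] + direction] + line[i + 1:]


def triadic_insertion(line, i, pitch):
    """Insert a triadic member after position i if no unstable interval results."""
    if not is_triadic(pitch):
        raise ValueError("inserted pitch must be a triadic member")
    if is_unstable(line[i], pitch) or is_unstable(pitch, line[i + 1]):
        raise ValueError("insertion would create an unstable interval")
    return line[:i + 1] + [pitch] + line[i + 1:]


def step_motion(line, i):
    """Join the pitches at i and i+1 by every intervening scale degree."""
    a, b = line[i], line[i + 1]
    step = 1 if b > a else -1
    return line[:i + 1] + list(range(a + step, b, step)) + line[i + 1:]


def show(line):
    """Render pitches in the scale degree notation, with quotes for higher octaves."""
    return " ".join(str(degree(p)) + "'" * (p // 7) for p in line)


fundamental = [2, 1, 0]                          # 3 2 1
repeated = triadic_repetition(fundamental, 0)    # 3 3 2 1
decorated = neighbor_insertion(repeated, 0, +1)  # upper neighbor between the 3s
print(show(decorated))
```

In this sketch each rule returns a new list, so a derivation is simply a chain of calls starting from one of the three fundamental descending lines; the checks inside each function stand in for the conditions stated in the rule.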

The figure entitled "Part of J. S. Bach's Two Part Invention Number Eight" shows a derivation of a
well known pitch progression using these rules. The result of each derivational step is indicated in bold
letters. It is incorrect to assume that this is a derivation of the piece. It is simply a convenient way to show
some of the rules. Note that the triadic repetition must precede a neighbor insertion. An ordering not shown
on this picture is that triadic insertion precedes step motion. This derivation is defined as being a synthetic or
top down parse representation of the piece. Another representation is shown at the bottom of the figure; this is
a synoptic viewpoint of the piece, as no ordering on how the derivation was done is shown. A third
representation would be an analytic representation or a bottom up parse.

An abbreviation of the rule names will be used in the figures. The descending fundamental line will
be labeled F, neighbor notes will be labeled N, triadic insertion will be labeled I, triadic repetition will be
labeled R, ascension and descension will be labeled A and D or S.

Each of the above rules expresses a facet of the idea presented in the last chapter. There, it was
theorized that some classes of tonal music use methods to direct attention to and from the members of the
tonic triad. The rules expressed above do this. Note that both the concept of step and neighbor involve
intervals that were defined in the last chapter as being unstable. The system of rules guarantees that the
unstable intervals will come to rest on a tonic triad member. The concept of repetition reinforces the
presence of a triad member. The fundamental descending line connects members of the triad by step motion.
Triadic insertion guarantees that the major sections of the progression are reinforced with members of the
tonic triad. From the fundamental line to the most local structure, it is clear that the concepts of stability,
instability and resolution to the tonic triad are embedded deeply into the framework of the system.

4.3 Major I

These rules are all defined with known terms. It is easy to state them using a grammar in terms of
pitch. This is done in the figures entitled "Major I".

This grammar, like all others, consists of four things: terminals, non-terminals, productions and the
sentential symbol.

Here, S is the sentential symbol, the key symbol from which the tonal representation is derived. The
production using the sentential symbol consists of the string of symbols following the sentential symbol. The
arrow means "can be replaced by". The dark vertical bars located on the right hand side of the productions
represent the "or" operation. The @ sign simply designates where the melody starts. This means that the
descending fundamental line production can be read as:

S can be replaced by @EDC or @GFEDC or @C'BAGFEDC.

The lowercase numbers are terminal symbols. They are what the non-terminal symbols (the letters
in bold) must eventually resolve to. This means that the first triadic repetition rule can be read as: C can be
replaced by two Cs or by the tonic.

The triadic insertion rule is essentially a grammatical implementation of the interval table presented
in the last chapter. A shorthand notation is used to express this knowledge. The set membership symbol
indicates that the designated non-terminal can resolve to any terminal in the brackets. The information could
have been represented using just the "or" symbol, but this version was viewed as being more concise. Note that
although this notation has the disadvantage of indicating the resolution of non-legal intervals (i.e. 4 4, 4 6 or
4 5' in the UV production for example), these intervals will never appear in the string due to the structure of
the language.

In this grammar, the dark symbols in the ascension-descension productions are non-terminals. In the
ascension-descension productions, a shorthand notation for the concept of octave equivalence is used. The
quoted (') symbols designate that both sides of the indicated production are to be raised in pitch one octave.
For example, J' refers to the J production where all entries to the right of the arrow are raised one octave in
pitch to 4', 5', 6', 7' and 1''. Each of the rules is octave equivalent, that is, each side can be quoted or
unquoted at will in order to produce a new rule. However, one must quote both sides when doing this: 1 does
not produce 1', for example.
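
The quoting convention can be sketched as a small helper. The representation of a production as a
pair of symbol lists and the names quote and raise_octave are assumptions made for illustration. The
only facts taken from the text are that both sides must be quoted together and that J' resolves to
4', 5', 6', 7' and 1''.

```python
def quote(symbol):
    """Raise one symbol by an octave by appending a quote mark."""
    return symbol + "'"


def raise_octave(production):
    """Quote both sides of a production (lhs symbols, rhs symbols) at once."""
    lhs, rhs = production
    return [quote(s) for s in lhs], [quote(s) for s in rhs]


# The J production as described in the text: J resolves to 4, 5, 6, 7 or 1'.
J = (["J"], ["4", "5", "6", "7", "1'"])
J_QUOTED = raise_octave(J)
```

Because the helper always transforms the two sides together, it cannot produce a rule in which only
one side has been raised.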

In the neighbor insertion, ascension, descension and the triadic insertion productions, two
non-terminals appear on the left hand side of the production. This means that the rules are context sensitive.
If there is only one non-terminal on the left hand side of the production (and nothing else), as in the triadic
repetition rule, the rule would be context free. The distinction between the two becomes apparent when
constructing automatic procedures for the analysis of strings generated by these rules. Automatic analysis
procedures or bottom up parsers are much easier to construct for context free grammars than for context
sensitive ones.

Note that this grammar consists solely of string lengthening rules. This means that one can always
determine whether or not a sentence is a member of the grammar by a very simple procedure. The procedure
is to first note the length of the sentence in question. Next, view derivation as a "tree" with the sentential
symbol as the root. Grow branches on this tree from the central root out by applying each of the possible
rules whenever applicable. Grow the tree until all of the productions on the outer branches are the length of
the input string. Finally, compare the input string to all the branches. If a match is found, then the string is in
the language. If no string matches, it is not. This procedure is guaranteed to work, because it generates all the
top down parses the length of the input. The procedure is guaranteed to terminate because all rules lengthen
the string; none shorten it or leave it the same size.
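
The procedure above can be sketched as a breadth first search over sentential forms. This is a
minimal sketch, assuming that a grammar is given as (left side, right side) string pairs in which
every right side is strictly longer than its left side. The toy rules below are hypothetical and are
not the Major I productions.

```python
from collections import deque


def derivable(target, start, rules):
    """Return True if target can be derived from start using string lengthening rules."""
    for lhs, rhs in rules:
        if len(rhs) <= len(lhs):
            raise ValueError("every rule must strictly lengthen the string")
    seen = {start}
    queue = deque([start])
    while queue:
        form = queue.popleft()
        if form == target:
            return True
        for lhs, rhs in rules:
            pos = form.find(lhs)
            while pos != -1:
                new = form[:pos] + rhs + form[pos + len(lhs):]
                # Forms longer than the target can never shrink back, so prune them.
                if len(new) <= len(target) and new not in seen:
                    seen.add(new)
                    queue.append(new)
                pos = form.find(lhs, pos + 1)
    return False


# Hypothetical toy rules: one context free repetition and one context sensitive insertion.
TOY_RULES = [("S", "CEC"), ("C", "CC"), ("CE", "CDE")]
```

The search terminates for the reason given in the text: since no rule shortens a form or leaves its
length unchanged, only finitely many forms of at most the target length exist.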

The system looks powerful, but power of one sort must usually be traded off for power of another
sort. Consider the musical example labeled "Twinkle, Twinkle Little Star". Note the repeated sixth, fourth
and second degrees. The presented system is simply incapable of producing them, and hence, incapable of
producing any composition containing them. This brings us to the next result.

The Major I system is consistent, that is, if the terminals of any given string produced in the system
are interpreted as being the scale degrees of a major mode melody, then only tonal compositions are
produced.

Completeness is defined as being the property that every "true" string in the system is producible
from the axioms (the sentential production) and the rules of inference (the other productions). The Major I
system is incomplete, that is, there are tonal compositions that cannot be produced using the grammar.

The Major I system is guaranteed to produce only tonal compositions, but it will not produce all of
them. This means that the system is not powerful enough to justify the interpretation of being a total tonal
progression grammar.

To get a better perspective on this, let's consider the general problem of grammatical inference [19].
Grammatical inference is the process of inferring the grammar of a language when only strings produced in
the language are allowed to be used in the inference. In order to infer any grammar from music, one must
have source material to work with. It is interesting to note that the grammatical inference problem is
unsolvable for most general grammars. In order to infer a grammar from music, or to perform any equivalent
operation, working purely from examples is insufficient. One must first have a definition of what is music
and what isn't. Unfortunately, like "love" and "intelligence", "music" is a term where "definitions" lead to
arguments, not enlightenment. One of the basic complaints made of Schenker in his own day was that he
tried to state what music wasn't. It turned out that his definition fit the compositions that the major
composers of his day were producing. Perhaps this is why his theories were not as respected in his own day as
they are (by many) today. Since, in general, no one wishes to make normative statements about what is music
and what isn't, it must follow that no grammatical system (or system that can be transformed into a
grammatical system) will ever be complete. The point of this is not that music theory
is a dead field. It is that "music" itself has such a broad interpretation that anything that purports to explain it is
probably doomed to failure. The best that one can hope to do is theorize about certain aspects of it and see
how big a chunk can be explained using as small a system as possible. This is done in the next section for the
system shown here.

4.4 Major II

The Major I grammar seems a little unwieldy at points, particularly the last set of productions. This
can be interpreted a number of ways. One is that grammars are not the best way to implement the ideas
shown here. Maybe the procedural method presented at the beginning was "optimal".

But this would also violate the initial intuition, that local state can be characterized by simple rules.
Perhaps these productions will seem simpler if viewed from another perspective. In the Major Melody I
grammar, each tone was viewed as an object. If instead, the intervals between the tones are viewed as objects,
then another grammar can be constructed. This is shown in the Major Melody IIa grammar. Note that some
of the rules are now written using a different size font. The smaller font is used to express the concept of step.
The unit step, designated as a 1 for a rising step and a -1 for a falling step, is the only terminal for the interval
grammar. The other numbers are non-terminals expressing the step size between two notes.

Now we see something rather interesting. All the productions are defined in terms of step or pitch,
that is, all except one. The triadic insertion rule is defined in terms of pitch and jump (a "jump" is a change
of pitch larger than a step). This leads to the observation that, excepting the triadic insertion rule, all the rules
are context free when interpreted in the right domain. It is proposed here that the rules be divided into three
types: those that look at pitch, those that look at interval, and those that provide a switch between the two.
The triadic insertion rule is of the last type. It operates by doing two things: inserting an interval greater than
a step, and leading the interval to a tonic triad member. Since it operates in this switch role, the triadic
insertion rule is apart from the others; it may be thought of as a metarule. It is not the only possible
"switching" metarule.
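
The change of viewpoint from tones to intervals can be sketched in a few lines. This is an
illustrative sketch only: the integer encoding of scale degrees, the function names and the example
melody are assumptions made here, and the three labels simply mirror the distinction between
repetition, step and jump drawn in the text.

```python
def intervals(degrees):
    """View a melody as the signed step sizes between successive scale degrees."""
    return [b - a for a, b in zip(degrees, degrees[1:])]


def classify(interval):
    """Label an interval as a repeat, a step (size 1) or a jump (larger than a step)."""
    size = abs(interval)
    if size == 0:
        return "repeat"
    if size == 1:
        return "step"
    return "jump"


# Hypothetical degree sequence: jumps among triad members, then a descending run of steps.
MELODY = [1, 3, 1, 5, 1, 8, 7, 6, 5]
LABELS = [classify(i) for i in intervals(MELODY)]
```

In this view the points where the labels change from "jump" to "step" are the places where a
switching metarule would have to change the interpretation.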

The result of the context freeness of certain classes of music has not been obtained previously. It is
similar to a claim currently made in the field of linguistics, namely that English syntax can be described by a
context free grammar, as long as syntax is not used to account for regularities better attributed to semantics.
The bit of semantics involved in this case is whether it is best to look at the notes or to look between them.


The reader may be a little confused at this point. First, the rules were context sensitive, for a second
they seemed context free, and now they are a combination of both. The confusion can be cleared up quite
easily. A normal formalism consists of something else besides objects and rules; it also consists of an
interpretation. The above result seems to point to the worth of determining a formalistic way of showing
when to switch interpretations on the same object.

4.5  Why  Two  Grammars  are  Better  than  One 

This multiple grammar model affects the number of sentences generatable by the system. Take the
system consisting of parts A and B in the figure labeled "Multiple Grammars". Split Ga into two
independent grammars, G1 and G2. Consider the resulting total system consisting of the grammars listed in
parts A and C. Decomposing the melody in part E, it is seen that the sentences in the language are
constructed of "parse segments". Only one of the two grammars generates the terminal symbols of each parse
segment. The length of each segment is designated by the non-bold symbols starting with the letter "s".

Note that Grammar Ga can generate 8**N sentences of length N. Grammars G1 and G2 can only
generate 4**N different melodies apiece. So, parsing the melody with two different grammars may alter the
number of generatable melodies.

What is the exact relationship? Consider G Grammars, each containing X terminals (as diagrammed
in part D). Let each grammar generate melodies with N total notes containing P parse segments of arbitrary
length (the lengths of the segments need not be equal). Then there are (G**P)*(X**N) total melodies
possible. For the example shown here, the ratio of the generatable strings in the Ga system to the generatable
strings in the G1 and G2 systems is (8**N)/((2**P)*(4**N)) = 2**(N-P). The point of this exercise is to show
that by using different grammars to generate different sections of a melody, one can reduce the number of
generatable melodies by an exponential factor. Note that this reduction will only occur if the grammars are
selected so that P is significantly less than N.
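
The counting argument can be checked numerically. The sketch below only restates the formulas from
the paragraph above; the function names are assumptions, and P is taken to be at most N because
every parse segment holds at least one note.

```python
def decomposed_count(g, x, n, p):
    """Upper bound on melodies: one of g grammars per segment, one of x terminals per note."""
    return (g ** p) * (x ** n)


def reduction_ratio(n, p):
    """Ratio of Ga's 8**n sentences to the bound for G1 and G2 (2 grammars, 4 terminals)."""
    return 8 ** n // decomposed_count(2, 4, n, p)
```

For a 16 note melody parsed into 4 segments, reduction_ratio(16, 4) equals 2**12, while a melody
that switches grammar on every note (P equal to N) gives a ratio of 1 and no reduction at all.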

Although the number of generatable melodies may be less for decomposed grammars than for
undecomposed grammars, there is a problem with the above argument. If the number of grammars in the
example had been 4 instead of 2 and the grammars' 4 terminals were still members of the undecomposed
grammar's set, then the upper bound on generatable melodies would have been 16**N. This doesn't mean
that decomposed grammars generate more melodies than the undecomposed grammar; indeed, the
undecomposed grammar completely spans the space of all possible melodies, so that would be impossible.
Rather, it means that one can obtain ambiguous parses. Since grammars G1 and G2 are completely
independent (they have no terminals in common), ambiguous parses don't occur with the above two
grammars.

Therefore, in order for the decomposition to reduce the number of generatable strings, the grammars
should be reasonably independent (and hopefully the ambiguous derivations have meaning).

4.6 A Theory of Music Perception - Major IIb

Major IIa has been slightly modified and is now presented as Major IIb. The neighbor note insertion
in this grammar is interpreted as being a restatement of a tonic triad member via an adjacent scale degree, as
opposed to being an interval production. Note that in addition to being context free, the grammar has been
structured to produce leftmost derivations. This grammar can be said to present a partial model of music
understanding because it allows the listener to parse the music as it is being heard.

As music is played, expectations are created by the appearance of certain notes and are fulfilled by
others which create their own expectations. This grammar models these concepts quite well. Let's look at the
two part invention number eight again. In hearing the initial 1, the listener knows that it is either a member
of the fundamental descending line, the first part of a triadic repetition or a neighbor insertion. In hearing the
second note, he knows that it must be the result of a triadic insertion. In hearing the third note, he knows that
the first note was repeated. This means that the original R production resulting from the first has been
evaluated completely and the expectation of a neighbor note resulting from it can be thrown away. However,
the third note can have a neighbor or a repetition, and so on till the sixth note. Here, the switch production
has been invoked, and the seventh note is a descending step. The interpretation changes and the D
production in the descending step is activated along with the possibility of a neighbor note production.
Another step occurs; this 1' is not going to be reinforced, so the R' possibility is not evaluated. Two more
steps: will the 5 be reinforced? Yes! And so on.

This is all fine and good, but the above paragraph did not list all the possibilities that were valid at a
given instance. It did not reflect the fact that Bach did the listener a favor in the first few notes by establishing
where the tonic, mediant, and dominant were. It did not account for the possibility that a neighbor note to
the first 1 might occur after the second 1 had been stated, and so on. Worse yet, there was nothing in the
grammar to prevent termination of the descending step at the sixth scale degree (note that Major I didn't
have this problem!). All of these things could be implemented in the Major IIb grammar. None of them
were. The reason is very simple: the grammar would have been much longer if this had been done. A
grammar model can represent all aspects of musical structure, but it cannot represent them well. The aspects
that it can represent well have been presented. In the work here, the grammar was transformed, but it was
never (and probably will never be) expanded much beyond this level.

This argument answers a question raised when analyzing music by "layers". If we view tonal
structure as being the elaboration of a basic concept through many levels, should the elaboration primitives be
the same for each level? Perhaps for a model of music cognition above the feature detector level, a convincing
argument for the homogeneity of elaboration operators on all levels can be made. Here it cannot.

What will the elaboration operators be? How is the above knowledge used? This is where the
concept of control procedure and metarule comes in again. One example metarule is that a neighbor note
association is made to the last appropriate triadic member heard. Another is that a descending step may not
end on a non-triadic member unless the last step is used in a neighbor note configuration. What other
metarules should be used here? Expectation of common cliches, "common sense", rhythmic cues, sound
intensity and other methods can be used to decide what to expect, and when to throw away unevaluated
non-terminals. Investigation into this is starting, but current results will not be reported here.

4.6.1 Confusion, Boredom and Interest

If this theory is true, what are the ramifications? One is that we can anticipate what boredom,
confusion and interest would correspond to in the model.

A person can be bored or confused on many levels. If the rules are learnable on a basic level, then
those who don't possess these production rules won't have a "deep understanding" of the Major II music they
hear. They will hear it, but they won't be listening to it. They could be bored by the whole thing.

The idea of boredom also brings to mind the idea of interest. Major II is ambiguous. That means that
many productions can be derived in many different ways. An example is shown in a fugal theme in the figure
labeled "Cherubini". It is not clear whether to view the 1 or 5 as repeating. Fulfilling expectations in a way
that is logical, but not expected, can be construed as corresponding to a mechanism of interest.

The figure entitled "More of Bach's Two Part Invention Number 8" shows just that. Some of the
concepts expressed here are seen working in the diagram. Bach opens the piece by establishing the key. Note
the tonic repetition. Next he reinforces all members with a descending step and neighbor motion. The dark
lines precede areas where the music is given rhythmic stress. So the tonic is used to end the step, it is
reinforced rhythmically and is used to launch the next section. All this is important because it is questionable
where the music is going at this point. Is 3 or 5 repeating? We see that 5 wins this round as it is stressed by
neighbor motion.

4.7 A Theory of Music Improvisation

One can expand the Major II grammar in a left to right order "on the fly" if one remembers the
unevaluated non-terminals.

Embracing the concepts presented thus far would seem to imply the following model of
improvisation. The improviser must have three things with which to work. First, several sets of local rules,
such as the set presented above. Second, a store which records the productions that have not yet been expanded.
Third, a set of pattern parameters, acquired as the improvisation proceeds, which helps decide which of the
set of local rules to apply next. How large a repertoire of rules, coupled with how big a store, coupled with
what (and how big) a set of pattern parameters are required to do "good" improvisation? Can a tie between
understanding a sentence and improvising music be found? If so, then the work of Marcus [20] could be used
in this problem to see if improvisation is LR(1).
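
The first two ingredients of this model can be sketched as a left to right expansion that keeps a
store of symbols not yet expanded. This is a minimal sketch under stated assumptions: the toy rules,
the function names and the two selection policies are hypothetical, and the policy argument is only a
stand-in for the pattern parameters, about which the text gives no detail.

```python
def improvise(start, rules, choose, max_notes):
    """Expand start left to right, emitting notes as soon as terminals are reached."""
    store = [start]  # symbols not yet expanded, leftmost symbol on top
    notes = []
    while store and len(notes) < max_notes:
        symbol = store.pop()
        if symbol in rules:
            expansion = choose(symbol, rules[symbol], notes)
            store.extend(reversed(expansion))
        else:
            notes.append(symbol)
    return notes, store[::-1]  # notes played so far and what remains unexpanded


# Hypothetical toy rules: a descending line 3 2 1 with optional repetition of 3 and 1.
TOY_RULES = {
    "S": [["E", "D", "C"]],
    "E": [["E", "E"], ["3"]],
    "D": [["2"]],
    "C": [["C", "C"], ["1"]],
}


def plain(symbol, alternatives, notes):
    """Policy that always resolves a symbol with its last alternative."""
    return alternatives[-1]


def repeat_once():
    """Policy that repeats each repeatable symbol once before resolving it."""
    used = set()

    def choose(symbol, alternatives, notes):
        if len(alternatives) > 1 and symbol not in used:
            used.add(symbol)
            return alternatives[0]
        return alternatives[-1]

    return choose
```

Stopping the expansion early leaves the unplayed remainder in the store, which is the sense in which
the improviser must remember what has not yet been expanded.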

4.8  The  Minor  Mode 

To conclude this chapter, some productions for the minor mode are introduced. They basically
constitute a set of exceptions to the rules previously discussed. These rules are listed in the figure entitled
"Minor Additions". In these rules, the symbol " means that the frequency of the preceding pitch should
be raised one semitone.

In looking at the rules, it is easier to see why there are two separate sets of rules for step motion.
Apparently, ascent is different than descent in the minor mode. The reason for this lies in the minor scale
itself. Recall that the minor scale interval steps form the pattern 2,1,2,2,1,2,2 and the major scale forms the
pattern 2,2,1,2,2,2,1. There exists a scale called the melodic minor which uses the minor pattern when
descending the scale, but when ascending, it uses the pattern 2,1,2,2,2,2,1. The ascent uses the raised sixth and
seventh scale degrees in order to make clear the direction of the step motion. A final example, Bach's Bourrée,
is shown derived.
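
The difference between the two forms of the minor scale can be verified directly from the step
patterns. The sketch below accumulates each pattern of semitone steps into offsets from the tonic;
the constant and function names are assumptions made for illustration.

```python
from itertools import accumulate

MAJOR = [2, 2, 1, 2, 2, 2, 1]
NATURAL_MINOR = [2, 1, 2, 2, 1, 2, 2]
MELODIC_MINOR_ASCENDING = [2, 1, 2, 2, 2, 2, 1]


def semitone_offsets(pattern):
    """Offsets in semitones of each scale degree above the tonic, octave included."""
    return [0] + list(accumulate(pattern))


def raised_degrees(lower, upper):
    """Scale degrees (1-based) at which the second scale lies above the first."""
    pairs = zip(semitone_offsets(lower), semitone_offsets(upper))
    return [degree for degree, (a, b) in enumerate(pairs, start=1) if b > a]
```

Comparing the natural minor with the ascending melodic minor picks out exactly the sixth and seventh
degrees, each raised by one semitone.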

4.9 Directions for Further Research

A broad set of metarules is needed to extend this model. Partial research has been done on this, but
the results will not be reported here.

The grammatical inference methods of Kaiser [19] could be used to expand the grammar. In her
paper, she parsed Morse code conversations using a limited class of grammatical productions. When a new
production was encountered, a special grammatical extension procedure was invoked. One example
extension that is possible in the present grammar is the inclusion of a double neighbor production:

K  -->  KDKF. 

However, even for a grammar that only produces monophonic tonal progressions, the present set of
productions is very limited. Therefore it may be difficult to use this set of productions as a basis for such
work. Perhaps a different "working set" of grammatical productions can be developed based on a set of
different rules. One could then develop procedures for switching between one working set and another.

The addition of rhythm and multiple voices is also a logical next step for other researchers in this
field.

However, the real direction should be obvious to each individual researcher as he or she realizes how
music theory can benefit their own interests.

4.10  Conclusions 

It  has  been  shown  that  a  class  of  music  theoretical  rules  used  in  the  study  of  composition  can  also 
account  for  musical  perception  and  improvisation. 

Unlike many papers, the purpose of this one is not to present a result. Its purpose is to prove a point.

The point is that music is a valuable domain for AI research.

For example, recall the "frames" paradigm. Few microworld models can state a reasonable
mechanism for switching viewpoints. In showing that a jump larger than a step causes the expectation of some
change, this paper comes close to having this capability. It would not have been possible were it not for the
simple nature of the microworld model chosen.

This microworld system can be (and is being) developed further. In the course of that development, I
suspect that this work will be quickly exposed as a wrong approach and that the ideas presented here will give
way to the use of some other cognition modeling concept. But then, that's the whole idea. Something new
will be learned and the pieces of the previous work can be salvaged for the new model.

Marvin Minsky once said that writing "Perceptrons", the book he co-authored with Seymour Papert,
may have been a mistake. In the book, many of the simple questions about Perceptrons were answered.
This deprived workers in the field of handholds when approaching the subject.

This thesis certainly doesn't have that problem. The work presented here does not consider any of
the interesting subjects in music deeply, although it does lay the groundwork for studying some topics. I
sincerely hope that readers will use this groundwork (or develop their own) as a tool to investigate the field.




5. Chapter Five: The Inexpensive Synthesizer System

The purpose of the inexpensive synthesizer is to provide an inexpensive tool that allows one to keep a
record of any keyboard performance done while away from the mainframe computer. The system must be
complete and portable. This rules out standard keyboard instruments, such as the piano. It must also
interface easily to other equipment used in the project. These constraints dictate that the inexpensive
synthesizer produce sound electronically.

The figure labeled "Music Synthesizer Architectures" illustrates some possible music synthesizer
designs.

In design A, one records the audio signals of the actual performance and has the computer transcribe
the recording into its Performance Schedule using "The Ear" link shown in the "Music Model" figure. The
signal processing problems encountered when trying to extract the performance parameters from a given
signal are immense [31] and preclude the use of this approach.

It is easier to interpose some mechanism that actually records what the performer is doing, as
opposed to the sounds that he is producing. Design B illustrates this idea. The "processor" could be random
logic wired to perform the appropriate algorithm, or it could be a general purpose device. When recording
parameters in this manner, one must remember that a good performer will compensate when playing a bad
instrument in order to produce the sounds he wants. Therefore, when using the performance parameters
extracted by the processor, it is important to know what the performer actually heard as he played. As
electronic synthesis allows a large degree of control over the actual sound production, there should be no
disparity between the produced sounds and recorded performance parameters in an electronic synthesizer.

It would be nice if the processor could perform both the keyboard scanning algorithm and the
synthesis routines as in design C. However, even if some of the commonly available 8 bit microprocessors
were only used for the synthesis function, they would be limited to synthesizing four voices through an 8 bit
D/A at an 8 kHz sample rate [29]. (In such a system, the function of the processor is to provide the appropriate
indices for a set of table memories.) This method is therefore quite memory intensive, and a microprocessor
used in this architecture could certainly not perform the additional task of keyscanning.
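
The idea of a processor that only supplies indices into table memories can be sketched as a simple
table lookup oscillator. This is an illustrative sketch, not the design discussed in the text: the
256 entry sine table, the unsigned 8 bit sample format, the phase accumulator and all names are
assumptions.

```python
import math

TABLE_SIZE = 256  # assumed table length
# One cycle of a sine wave stored as unsigned 8 bit samples (0..255).
SINE_TABLE = [
    round(127.5 + 127.5 * math.sin(2 * math.pi * k / TABLE_SIZE))
    for k in range(TABLE_SIZE)
]


def table_indices(freq_hz, sample_rate_hz, count):
    """Indices the processor would supply to the table memory for one voice."""
    step = freq_hz * TABLE_SIZE / sample_rate_hz
    phase = 0.0
    indices = []
    for _ in range(count):
        indices.append(int(phase) % TABLE_SIZE)
        phase += step
    return indices


def render(freq_hz, sample_rate_hz, count):
    """Look up one sample per index, as a D/A converter would receive them."""
    return [SINE_TABLE[i] for i in table_indices(freq_hz, sample_rate_hz, count)]
```

Every output sample of every voice costs one index computation and one table read, which gives a
feel for why a single small processor has little time left for other tasks such as keyscanning.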

In order to have a keyscanner and/or more voices, one could implement the system shown in design
D. Here, the work of synthesis is divided up among many processors. However, if a microcomputer were
used as the synthesizer processor, it would still be used only for table lookup; and as pointed out in the
introduction, when a microprocessor is used solely to execute simple algorithms, the overhead becomes
unacceptably high. Due to cost considerations, using microcomputers as processors in this architecture must
be rejected.

It was therefore decided that design B was the best gross structure to use. Having decided to use
design B, it was necessary to construct the keyboard, the processor-controller, and the synthesizer, as well as to
implement the interface software. The next sections will discuss these aspects of the overall design.

5.1 The Synthesizer

A block diagram of the synthesizer is shown in the "Inexpensive Synthesizer" figures. In order to
understand the operation of the device, we will start by considering how to best use a binary rate multiplier
(labeled BRM in the "Frequency Generation Section" figure) for producing a given system of intonation. To
do this, let's first review this and other concepts presented in the first chapters.

5.1.1  Producing  Systems  of  Intonation  using  a  Binary  Rate  Multiplier 

In order to perform most western music, one must choose a system of intonation, that is, a fixed set of
pitches, in which to play a given piece. The most common system of intonation used in western countries
today is the equal tempered system. The equal tempered system of intonation consists of a set of 12 different
pitch classes. The frequencies of the members of a given pitch class differ from one another by factors of two.
If two pitches' frequencies differ by only one factor of two (and are hence members of the same pitch class),
they are said to be an octave apart in pitch. If all the members of all the pitch classes are grouped together and
sorted in order of frequency, then each group of 12 pitches from the pitch class C to the pitch class of the next
higher frequency B is called an octave. Each pitch in this group is called a member of that octave. In the
equal tempered system of intonation, the ratio of adjacent frequencies in the octave is equal to the twelfth
root of 2. A part of the equal tempered system of intonation is shown below.


octave   pitch class   frequency (Hz)   frequency/8372

  9          C              8372            1.00000
  8          B              7902            0.94387
  8          A#             7458            0.89089
  8          A              7040            0.84089
  8          G#             6644            0.79370
  8          G              6271            0.74915
  8          F#             5919            0.70710
  8          F              5587            0.66741
  8          E              5274            0.62996
  8          D#             4978            0.59460
  8          D              4698            0.56123
  8          C#             4434            0.52973
  8          C              4186            0.50000

As shown in the last column, the scale can be thought of as having a main note, the ninth octave C, from
which all the other notes are obtained. Using this conception, one way of obtaining this series of pitches is to
make a box that can multiply the frequency of the ninth octave C by the given fractions and produce the
resulting frequencies as outputs. A BRM (binary rate multiplier) is just such a box. It has two inputs: a
frequency f and a binary word x. It produces one output, a frequency of value f * (x/n) where 0 <= x < n.
The value of n is a function of the specific hardware implementation; it is usually a power of 2. Both x and n
are integers. The input frequency is used to clock a counter and the binary word controls a state decoder.
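
The counter-and-decoder arrangement can be illustrated with a small software model. This is only a sketch under an assumed decoder rule (one common way to build a rate multiplier, not necessarily the circuit used in this design): each bit of the binary word x enables output pulses on a disjoint set of counter states, so that exactly x pulses appear for every n = 2**bits input clocks. The name `brm_pulses` and its parameters are hypothetical.

```python
def brm_pulses(x, bits):
    """Return one counter cycle (2**bits clocks) of BRM output pulses as 0/1 values.

    Assumed decoder rule: a counter state with exactly t trailing one-bits
    produces a pulse if bit (bits - 1 - t) of x is set. The all-ones state
    never produces a pulse, so at most 2**bits - 1 pulses appear per cycle.
    """
    n = 1 << bits
    out = []
    for count in range(n):
        # Count the trailing one-bits of the counter state.
        t = 0
        while t < bits and (count >> t) & 1:
            t += 1
        pulse = t < bits and (x >> (bits - 1 - t)) & 1
        out.append(1 if pulse else 0)
    return out


if __name__ == "__main__":
    print(brm_pulses(5, 3))
```

With bits = 3 and x = 5 the model emits pulses on counts 0, 2, 3, 4 and 6: five pulses in eight clocks, as f * (x/n) requires, but unevenly spaced.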


The pulses produced by the BRM are not of equal width. However, if the frequency multiplication is
done at a high frequency and the resulting pulse train is divided down to audio frequency using a divider
chain, the jitter is not noticeable. The only other synthesizer which uses this technique of producing a high
frequency signal that is divided down to remove jitter is the Northeastern Digital Synclavier Synthesizer[27].
They do not use a BRM however.


To produce notes in the equal tempered scale, a set of x's are needed. Suppose n is 4096. Then one
method of generating the x inputs that are needed is with the equation

x = Round(4096 * desired frequency / top frequency).
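
As a small worked illustration of this rounding rule, the sketch below applies it to frequencies taken from the table above. The function name `brm_word` and its parameter names are hypothetical; only the formula itself comes from the text.

```python
def brm_word(desired_hz, top_hz, n=4096):
    """Binary word x = Round(n * desired frequency / top frequency)."""
    return round(n * desired_hz / top_hz)


if __name__ == "__main__":
    # Frequencies (Hz) of the ninth octave C, eighth octave B and eighth octave C.
    for hz in (8372, 7902, 4186):
        print(hz, brm_word(hz, 8372))
```

Note that the top note itself maps to 4096, which does not fit in a 12 bit word whose largest value is 4095.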

This idea can be developed further. What is wanted is a series of frequencies that are related as in
the table above. Therefore, the frequency input to the BRM need not be the frequency of the top note. In fact,
the highest that a 12 input BRM can multiply by is 4095/4096, so it can't even pass the input frequency out.
What is therefore needed is the "best" set of binary word inputs to the BRM. "Best" will mean the sum of the
squares of the differences between the approximating and actual fractions is a minimum. The sets of
numerators are obtained by noting that

(desired note multiplier numerator/4096)/(top note multiplier numerator/4096) =

(desired note multiplier numerator)/(top note multiplier numerator) =

(desired note frequency)/(top note frequency).

Utilizing the fact that the frequency ratios between notes in any system of intonation are fixed, a
program can search for the "best" set of numerators quite simply.
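
One way such a search might look is sketched below. It is an assumption-laden illustration, not the program used for this work: the function name `best_numerators`, the restriction of the top numerator to the upper half of the 12 bit range, and the exact form of the error measure (numerator/top numerator compared against the ideal ratio) are all choices made here, so the result need not match the published figures.

```python
def best_numerators(ratios, bits=12, lowest_top=None):
    """Search for the top numerator whose rounded numerator set best fits `ratios`.

    `ratios` are the ideal frequency ratios of each note to the top note
    (the top note itself has ratio 1.0). Returns (error, top, numerators),
    where error is the sum of squared differences between the approximating
    fractions numerator/top and the ideal ratios.
    """
    n = 1 << bits
    if lowest_top is None:
        lowest_top = n // 2  # assumption: only consider tops in the upper half
    best = None
    for top in range(lowest_top, n):
        nums = [round(top * r) for r in ratios]
        err = sum((num / top - r) ** 2 for num, r in zip(nums, ratios))
        if best is None or err < best[0]:
            best = (err, top, nums)
    return best


if __name__ == "__main__":
    # Twelve equal tempered notes, from the top note downward.
    equal_tempered = [2 ** (-k / 12) for k in range(12)]
    err, top, nums = best_numerators(equal_tempered)
    print(top, err)
    print(nums)
```

The search is a single pass over the candidate top numerators, with one rounding per note, which is consistent with the remark that the program is simple.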

A table of "best" numerators for use in a binary rate multiplier generating several different systems of
intonation is shown below. The ratios for these systems can be found in [17]. The equal tempered scale is
listed as being "Diatonic" in the reference. This is reflected in the table.


Intonation             Notes/Octave   Top numerator   Sum of Squared Errors

Sub Infra Diatonic           5            3994              1.20E-9
Infra Diatonic               7            3880              6.75E-9
Diatonic                    12            3902              1.85E-8
Supra Diatonic              19            3994              4.05E-8
Just                        12            2912              0
Pythagorean/Chinese         12            4012              2.06E-8
Mean                        21            4064              5.06E-8
Mercatorian                 53            4092              1.83E-7
256 Diatonic               256            4071              1.15E-6

The above systems of intonation are quite interesting. As mentioned, the Diatonic scale is based on
12 notes in the octave, each adjacent pair differing from the other by a factor of 2**(1/12). The other diatonic
scales, the Sub Infra, the Infra and the Supra, have their adjacent tones differing from one another by the
fifth, seventh, and nineteenth roots of two. The Pythagorean scale is based on the frequency ratio 3/2 (a pure
fifth). The Mercatorian scale produces a scale by the same algorithm as the Pythagorean scale, but it carries it
to 52 iterations as opposed to 11. The Just scale is based on the intervals of the pure fifth and the pure third
(which has a frequency ratio of 5/4). The Mean tone scale is based on a fifth that is slightly smaller than a
pure fifth. The 256 Diatonic entry is included to show how fast the error grows in the diatonic scales using a
12 bit BRM.

If an 18 MHz crystal oscillator is used to clock the BRM, and the BRM is producing the equal
tempered scale using the 3902 value for the top octave numerator, then the American Standard Pitch equal
tempered scale (based on a fourth octave A of 440 hertz) will be produced.

Although the BRM needs to be run at as high a frequency as possible to divide out its inherent jitter,
the synthesizer multiplexes it into 16 oscillators in order to reduce chip count and cost. A consequence of this
in the present design is that some timing constraints are violated. Two of the 16 multiplexed voices don't
operate properly. It was deemed preferable to have 14 voices instead of 16 voices and more hardware, so the
design was left in its present form.

The RAM in the "Frequency Generation" figure contains the multiplexed frequency multiplication
numerators. The write controller updates their contents on processor command.

It is assumed in this section that a 12 bit BRM is used to generate just one octave of these systems.
The lower octaves would be obtained by dividing the produced frequencies by 2**m where m is the number
of octaves below the top octave where the desired frequency lies. How this is accomplished in practice is
described in the next section.

5.1.2 The Octave Shifter

As mentioned, it was assumed that the BRM produced notes in just one octave. It is
undesirable to produce notes over an eight octave frequency range using only 11 bits of total precision. The
reason is that under ideal conditions, human listeners can distinguish a 1 part in 8000 frequency difference in
the lower octaves (around 100 Hz)[?]. Therefore, even 12 bits of frequency precision per octave are not
sufficient for a high quality sound synthesizer. However, the remote synthesizer is not used under ideal
conditions, so 12 bits of precision per octave were deemed sufficient (i.e. tones produced by the remote
synthesizer do not sound "grainy" at the lower octaves).

The question of how one should change octaves using a BRM still remains. The method used here
takes advantage of the fact that shifting a binary number one position to the right in a fixed length binary
word divides the number by two. If a 12 bit code with zero fill at the bottom is input to a BRM, a frequency f
will be produced. If this 12 bit code is shifted down one position with zero fill at the top, it will produce the
frequency f/2. Thus, an 18 bit wide BRM can produce any frequency specified to 12 bits over a 7 octave
range.

If the frequency word is shifted one position more after it reaches the bottom, the least significant bit
will be lost. So one can use a BRM to synthesize frequencies over an 8 octave range if one is willing to have
only 11 bits of precision in the lowest octave.
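
The alignment just described can be sketched in a few lines. This is an illustrative model only: the function name `align` and the constants are hypothetical, and the hardware performs the shift with dedicated logic rather than software.

```python
CODE_BITS = 12   # width of the frequency code
WORD_BITS = 18   # width of the BRM input word


def align(code, octaves_down):
    """Place a 12 bit code in an 18 bit word, shifted down by `octaves_down` positions.

    With octaves_down = 0 the code sits at the top of the word with zero fill
    below it. Each further position halves the word and hence the frequency.
    """
    if not 0 <= code < (1 << CODE_BITS):
        raise ValueError("code must fit in 12 bits")
    top_aligned = code << (WORD_BITS - CODE_BITS)
    return top_aligned >> octaves_down


if __name__ == "__main__":
    for m in range(8):
        print(m, format(align(3902, m), "018b"))
```

Shifts of 0 through 6 positions keep all 12 bits of the code, giving the 7 octave range; a seventh shift drops the least significant bit, leaving 11 bits in the lowest octave.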

The function of the "Floating to Fixed Converter" is to perform this shifting operation on the 12 bit
frequency word. This hardware is equivalent to the aligning hardware used in some floating point machines.
The shifting was done in hardware because the controller is an 8 bit machine and does 18 bit shifting poorly.

The major parts of the frequency generation section have now been completely described. In actual
operation, the processor tells the synthesizer what pitch to play using the frequency select input, what octave
to play it in using the octave select input, and which voice to play it from using the voice select input. After
the data is set up, the processor sends a "Ready" signal to the synthesizer. The floating to fixed converter then
aligns the 12 bit word and the write controller writes the aligned word into the appropriate voice. The 4 bit
counter multiplexes the BRM by processing each voice through it and then changing the BRM's state with the
carry output line. This produces the 16 multiplexed voices on the BRM's output.

5.1.3 An Aside: A VLSI Synthesizer

A variant of the square wave synthesis algorithm was implemented in VLSI for an MIT 6.371
VLSI course term project. A block diagram of the project is shown in the figure labeled "A VLSI Frequency
Synthesizer".

The project consists of eight (frequency) programmable oscillators which multiplex their outputs
onto a single output line. This is similar to the above design; however, here each oscillator's output frequency
is determined by the relation fout = fclock/N (1 < N < 2**20).

The design consists of three main parts.

The data is fed serially into the machine from the left. This is done by applying a bit of data to the
DATA INPUT pad and taking the DATA BIT RDY input high. The high going edge is detected using a
discrete logic edge detector. The output of the edge detector is used to shift or hold the serial input data
in the serial input register. The fully loaded register contents consist of three bits of control information which
point to the voice register to be loaded, along with 20 bits of frequency data which specify the new contents of
the indicated voice register. Serial as opposed to parallel loading was used to save pads and pad space.

After the serial input register is loaded, the DATA WORD RDY signal is raised. On receiving this
signal the control PLA waits until the proper voice register recirculates to the input
multiplexers and then loads the register with the new 20 bit frequency word. (The PLA normally allows
the old voice register contents to recirculate.)

The parallel output of the voice register section feeds a count-down section which
(almost) looks like a recirculating shift register. The count-down counters are realized using a two input XOR
gate to determine the next state and a 20 input NOR gate to detect the zeroth state. This NOR gate
causes the voice register section to reload the count-down register afresh each time the zero state is detected.
This NOR gate also drives the multiplexed frequency output pad.

There are a number of reasons for not using this, or similar, VLSI designs to implement the remote
synthesizer. One is cost. A mass production run for this chip would cost approximately $10,000 to produce
500 chips. It would yield 100 to 200 good chips at a cost of $50 to $100 each. The chip cost of the equivalent
part of the discrete design is currently about $25. (Also, this project cannot currently utilize 100-200 ICs.)

Another reason is speed. An NMOS IC using the design methodology of this chip cannot, as of this
writing, be clocked as fast as discrete TTL.

Finally, the IC design is not as expandable as the discrete design. For example, it is relatively easy to
change the discrete design to strobe out a waveform memory even after the boards are built. It is currently
impossible to "retrofit" an IC as signals needed from the chip's mechanism may not be available on the
output pads.

5.1.4  Amplitude  Modulation 

The current remote synthesizer design has no amplitude modulation capability yet. It simply
demultiplexes the "16 voice multiplexed output" shown in the "Frequency Generation" figure into 14
separate voices, divides them down eight octaves and then adds them together through an analog summer.
However, two different amplitude modulation schemes have been devised for this synthesizer.

The first method is shown in the figure labeled the "ROM Amplitude Modulation Section". Here, the
output frequency produced by the BRM is used to clock a counter which accesses a 256 element waveform
memory table. (This signal was simply divided down to a lower frequency square wave in the previous
design.) Normally, the output waveform from the table will occur at 1/256th of the frequency fed to the
counter. Any waveform can be used, and there could be several different ones in the table. The output wave
will have jitter at the higher harmonics, but this jitter will not be noticeable in the audio range [27]. Since this
synthesizer is not a fixed sample rate system, the table can be as short as 256 entries long and still perform
better than a fixed point system table lookup scheme using interpolation on a 1K memory[27]. The reason for
this is that a variable sample rate system hits every table entry on a table entry boundary as it increments
through the memory.

The multiply operation indicated in the figure is obviously digital, although there are advantages to
making it analog. For instance, by cascading three 8 bit multiplying DACs, one controlled by the wave table,
one controlled by the amplitude register and one controlled by an envelope generator, one could obtain the
equivalent of a 24 bit DAC. This method is used in the Northeastern Digital Synclavier. The real trick in
doing this is to have a good set of normalization procedures to keep all of the DACs filled.

The final step in the figure has the adder mixing the voices together before they are passed out to the
DAC.

A second method of amplitude modulation is illustrated in the figure labeled "Square Wave
Amplitude Modulation Section". This method requires less hardware than the first, but is more limited.
Instead of modulating the amplitude of an arbitrary waveform, it amplitude modulates the square waves
output from the BRM. One can do this with AND gates instead of multipliers because square waves are dual
valued functions. Multiplying square waves by a given value is equivalent to switching that value on and off
at the frequency at which the square wave is oscillating. The modulated square waves are summed and output
as before.
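
The equivalence between multiplying by a two-valued square wave and gating with AND gates can be sketched as follows. The square wave is modelled here as the values 0 and 1, and the 8 bit amplitude width and the names `gate` and `mix` are assumptions made for illustration.

```python
def gate(amplitude, square_bit, width=8):
    """AND every bit of `amplitude` with the square wave bit (0 or 1)."""
    mask = (1 << width) - 1 if square_bit else 0
    return amplitude & mask


def mix(amplitudes, square_bits):
    """Sum the gated voices, as the adder does before the DAC."""
    return sum(gate(a, s) for a, s in zip(amplitudes, square_bits))


if __name__ == "__main__":
    print(mix([10, 20, 30], [1, 0, 1]))
```

For every 8 bit amplitude the gated value equals the product of the amplitude and the square wave bit, so no multiplier is needed.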

5.2  The  Keyboard 

The keyboard was bought commercially. It has a 5 octave range (61 keys) and has two SPST switches
per key. All SPST switches connect to a common bus. This gives the keyboard the capacity to be velocity
sensitive. This option is not currently used, although if it were implemented, the measurement error could be
very great. Sampling at several points during key depression would eliminate much of this error on a
multiswitch keyboard, but this keyboard does not have that capability.

It is interesting to note that there are only three significant parameters that a keyboard performer
sends to many keyboard instruments. They are the key's velocity, the time of string impact and the time of
string release. Therefore, things like key acceleration need not be recorded as a performance parameter.

During operation, the common bus is grounded and the switches are connected to a 64 input
multiplexer. This allows the controller to randomly access each key. There is no debounce circuitry on the
switches or the multiplexer. The debounce operation is performed in software.

5.3  The  Controller 

The controller is the device which glues all the parts together.

A block diagram of the controller is shown in the figure labeled "Control Board". There were three
reasons for designing a controller board as opposed to using an off the shelf system. They were size, cost and
parallel I/O capability. The processor is mounted on an 8 1/2" x 5" card. It is cheaper than any other
commercially available card of equivalent power. As shown, the synthesizer and keyboard require four 8-bit
ports of parallel input to control. More parallel I/O is needed if the system wants to send the parameters it
records in parallel to some device.

The processor can act as a slave to a remote system, receiving note information, encoding it and
playing it. However, its primary purpose is to act as a keyboard scanner, sending performance update
information to the storage device and simultaneously playing the notes struck on the keyboard through the
synthesizer. It can communicate with the storage device through a parallel or serial port. Only the serial port
has been used for this function so far. Serial I/O allows one to record the performance parameters on a
cassette tape and send them to a modem or directly to a mainframe machine.

Since the amplitude modulation section of the remote synthesizer has not been built yet, no exact
claims can be made about how well the processor simultaneously updates the synthesizer, scans the keyboard
and transmits performance parameters. However, the synthesizer described in the next chapter was
controlled using this controller, and it was able to update four groups of three 12 bit amplitude envelopes
while connected to an N-key rollover input device. The currently implemented device handles all of its tasks
easily using the algorithm described in the next section.

5.4  The  Software 

The controller must perform three tasks. First, it must scan the keyboard and eliminate the key
bounce and contact noise it encounters. Second, it must route the pressed keys to the correct oscillator voices.
Third, it must send the performance parameters to the storage device.

There are two ways to implement the keyboard scanning algorithm. The first and most obvious is to
use a dedicated piece of hardware. However, this would entail additional cost, as no small, cheap keyscanner
is currently available. (Another problem is that none of the organ keyboards considered could be easily
converted to have a key matrix output, which dedicated keyboard interface chips, such as the Intel 8279,
require.) It was therefore decided to let the synthesizer controller also perform the keyscanning algorithm.

The routing algorithm scans M keys and assigns the played keys to N oscillators. It is not allowed to
overwrite oscillators until they have finished playing a note. It is not allowed to assign the same played key to
two oscillators (i.e. the mapping between played keys and oscillators is one-to-one and onto). This being the
case, if more than N keys are hit, the algorithm will not allow the excess keys to be played.

The  performance  parameter  transmission  algorithm  simply  packs  values  in  a  queue  and  then  unpacks 
and  sends  them  when  the  processor  is  free. 

All the algorithms are performed in linear time. The routing algorithm is reported here because all
previous algorithms considered were O(MN) or required a doubly linked list or a stack.

The figure labeled "The Keyscan Algorithm" shows intermediate stages of the algorithm. It reflects
the fact that there are two structures associated with it: a keyboard scan list (KSL) and an oscillator note list
(ONL). The KSL indicates if a key is pushed down. Each member of the ONL represents an oscillator and
points to the current key that that oscillator is playing. In the figure, the letters beneath the keyboard
indicate which keys are being pressed. The black marks on the keys indicate which entries in the KSL are turned
on. The black squares above the ONL are 1 bit tags used in the course of the algorithm.

The algorithm starts by scanning the ONL. Any "0" entry indicates that that oscillator is currently
playing a rest, so "0" entries are skipped. If the ONL entry is not "0", it is marked and a check is made to see
if the key it points to is still being pressed. If it is, the KSL entry designating that key is marked. If it is not,
the oscillator register's value is put in a queue along with the key release time (this information is required by
the performance parameter transmission algorithm) and the ONL entry is assigned the value "0". After the
ONL is completely scanned, the free register pointer is then set to point at the first oscillator register.

Next the keyscan algorithm (KA) is called. It simply scans the keyboard from left to right and reports
keys that are being pressed (it may also mistakenly report noise as a keypress). If the KA reports a key that
the KSL has marked, we know that the register scan has just examined it, so we simply unmark that KSL
entry and go on. If the KA reports a key that is not marked, we assign the note to the first unmarked (hence
unassigned) oscillator register. This is done by advancing the free register pointer until it is pointing to an
unmarked location and then filling that location. When the KA finishes scanning the keys or the free register
pointer increments past the last ONL entry, then the allocation algorithm is done. The marked non-"0" ONL
entries are keys that were pressed down and are still pressed down. The unmarked non-"0" entries are new
keys to be played (or are noise). The marked "0" entries are keys that have been released. The unmarked "0"
entries are free registers where the oscillator is playing a rest. Next the performance parameter transmission
algorithm outputs the queued values for 10 ms. This output loop provides the debounce time necessary to let
the keys settle.

Finally, the oscillators are updated. This is done by scanning the ONL registers. If the entry is
unmarked and the key it is pointing to is not pressed, then the register is deallocated. This case would occur
in practice if there had been noise on the key input. Again, this is the reason for the 10 ms wait. If the
unmarked register's key is being pressed, then the key is played and queued along with its "start time". If the
contents of the register indicate a rest and the register is marked, then a rest is played and the register is
unmarked and deallocated.
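
The three scans described above can be put together in a simplified sketch. It is an illustration under stated assumptions rather than the controller's actual code: keys are numbered from 1 so that 0 can stand for a rest, the 10 ms transmission loop is replaced by an optional `settled` set holding the keys still pressed after the debounce wait, and the names `scan_pass`, `onl`, `pressed`, `settled` and `queue` are hypothetical.

```python
def scan_pass(onl, pressed, now, queue, num_keys=64, settled=None):
    """Run one allocation pass. onl[i] is 0 (rest) or the key oscillator i plays."""
    if settled is None:
        settled = pressed          # assume no noise unless told otherwise
    n = len(onl)
    osc_mark = [False] * n                 # 1 bit tags on the ONL
    key_mark = [False] * (num_keys + 1)    # marks on the KSL

    # Scan 1: mark oscillators in use; queue releases and turn them into rests.
    for i in range(n):
        key = onl[i]
        if key == 0:
            continue
        osc_mark[i] = True
        if key in pressed:
            key_mark[key] = True
        else:
            queue.append(("release", i, key, now))
            onl[i] = 0

    # Scan 2: walk the keys left to right, giving new keys to unmarked registers.
    free = 0
    for key in range(1, num_keys + 1):
        if key not in pressed:
            continue
        if key_mark[key]:          # already sounding: unmark and go on
            key_mark[key] = False
            continue
        while free < n and osc_mark[free]:
            free += 1
        if free == n:              # no register left: excess keys are ignored
            break
        onl[free] = key
        free += 1

    # (The debounce wait, filled by parameter transmission, would happen here.)

    # Scan 3: start settled new notes, drop noise, clear the tags.
    for i in range(n):
        if not osc_mark[i] and onl[i] != 0:
            if onl[i] in settled:
                queue.append(("start", i, onl[i], now))
            else:
                onl[i] = 0
        osc_mark[i] = False
    return onl


if __name__ == "__main__":
    onl, queue = [0, 0], []
    for t, keys in enumerate([{3, 5}, {3, 5, 7}, {5, 7}, {5, 7}]):
        print(t, scan_pass(onl, keys, t, queue))
```

Each scan touches every oscillator or key at most once and the free register pointer only moves forward, so a pass takes time linear in M + N. In this sketch a register released in one pass becomes available to a new key on the following pass.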

5.5 Conclusions

The system is adequate, reasonably portable and cheap. It can probably be packaged into the
keyboard case, and this will be tried. The storage device is the bulkiest part. Until the technology evolves to
the point where cheap portable mass storage is a reality, the machine will never be totally satisfactory as a
portable music typewriter. Perhaps bubble memory will solve this problem someday.

Lack of amplitude modulation and the limitation of the waveform to square waves was viewed as
being a problem by some, but not by others. Such a view is apparently dependent on what type of music
synthesis hardware the person had worked on before. Amplitude modulation hardware will probably be
added to the design eventually.

The true purpose of this hardware is to provide a cheap device for music composition research. It
serves this purpose well.

6. Chapter Six: Theme and Variations on a Digital Signal Processor
6.1  If  God  has  His  mathematics,  let  Him  do  it 

In constructing the main digital signal processor two choices were made. The first was to decide what
model(s) of signal generation the processor should be geared toward. A processor embracing many models of
generation would be much more complex (and expensive) than a processor based on a small set of models.
Secondly, the implementation mechanism was chosen.

The models considered can be broken down into two classes: models using non-linear signal
generation techniques and models using linear signal generation techniques. Models using linear signal
generation techniques differ from one another in the sets of orthonormal functions used to approximate a
given waveform. The most common among these methods is sine summation synthesis, which is sometimes
called Fourier synthesis. Another interesting set which is briefly discussed here is Walsh function synthesis.

Non-linear signal generation techniques are not as well understood as linear systems. Also, the set of
functions that they can synthesize is usually not complete. That is, there exist periodic functions which cannot
be synthesized by some non-linear methods. The advantage in using such methods is that they are usually
very cheap computationally. One of the few methods in wide use is called FM synthesis. It synthesizes
sounds by varying the parameters to the FM Equation (see entry III in the figure labeled "Synthesis
Functions Used in Music"). Other models were considered but won't be discussed here.

The main idea behind constructing the digital signal processor was that it be sophisticated enough to
fulfill most people's needs and yet not so sophisticated that people can't understand it. Therefore, models of
sound production were chosen that were reasonably well known. Those models and their implementations
are discussed below.

6.2 A Design Based on Synthesis by Walsh Functions


One mechanism of sound production considered was Walsh function synthesis. Walsh functions are
a set of orthonormal functions that take on the values -1 and 1 in the (0,1) interval. When plotted, they look
like a set of square waves (Rademacher functions) with duty cycles that change through the period of the
function[31]. Walsh functions contain square waves, and have better convergence properties. Example Walsh
functions are shown in the figure labeled "The First Eight Walsh Functions". As can be seen, Walsh functions
are ordered by the number of zero crossings per unit interval that they possess. The ordering number is also
called the function's sequency.

The bivalued amplitude of Walsh functions makes them particularly well suited for real time digital
waveform synthesis. Recall that in sine summation synthesis, one must multiply each sine wave by some
weighting value. Since the sine wave takes on continuous values in the unit interval, this requires that a true
multiplication be done when performing the weighting. Multiplication is an expensive operation to do in real
time. However, since a Walsh function only assumes two values, the weighting operation can only produce
two values for any weighted input. These values are either the weight or its negation. So in practice, the
multiplication is replaced with a "Complement/Not Complement" operation. This is an inexpensive
operation to perform in real time.

What remains is to compute the -1 and +1 values for the Walsh functions in the unit interval. The
following fact gives a simple method for doing this. The rising edges produced by a binary rate multiplier with
input i indicate when the zero crossings of the ith Walsh function occur in the unit interval.

Therefore, if a BRM is followed by a "T" flipflop, the output of the flipflop produces the set of
Walsh functions exactly (multiplied by either +1 or -1 depending on the initial state of the flipflop as it
detects the first edge). When viewing the waveforms produced by such a circuit, one interprets the flipflop's
"zero" state as representing -1 and the "one" state as representing +1.


In the Amplitude Modulation section of the last Chapter, the BRM was viewed as a method of
producing jittery square waves. Here, (combined with the integrating flipflops) it is viewed as a cheap
generator for Walsh functions. This alternate interpretation of the BRM makes it a much more powerful
device than before.
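
The BRM-plus-flipflop construction can be checked numerically. The sketch below is a minimal Python
model, not the hardware described here. It assumes an idealized n-bit binary rate multiplier in which
rate bit k is gated by the counter states that have n-1-k trailing zeros (a common arrangement, taken
here as an assumption), and all function names are illustrative.

```python
def brm_pulses(i, n_bits):
    """Counter states at which an idealized BRM with rate input i emits a pulse."""
    pulses = []
    for c in range(1, 1 << n_bits):
        trailing_zeros = (c & -c).bit_length() - 1
        # Assumed gating: rate bit (n_bits - 1 - trailing_zeros) enables this state.
        if (i >> (n_bits - 1 - trailing_zeros)) & 1:
            pulses.append(c)
    return pulses


def walsh_from_brm(i, n_bits, initial=1):
    """Toggle a 'T' flipflop on every BRM pulse; return one period of +1/-1 samples."""
    pulses = set(brm_pulses(i, n_bits))
    state, out = initial, []
    for c in range(1 << n_bits):
        if c in pulses:
            state = -state
        out.append(state)
    return out


def sign_changes(samples):
    """Number of zero crossings inside one period."""
    return sum(1 for a, b in zip(samples, samples[1:]) if a != b)


# The eight functions produced by a 3-bit model, indexed by the rate input.
walsh = [walsh_from_brm(i, 3) for i in range(8)]
```

In this model the function selected by input i has exactly i zero crossings, and the eight outputs are
mutually orthogonal, which is what one would expect of Walsh functions in sequency order.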

Used as a Walsh function generator, the BRM can be thought of as a table memory with an extra
input. This input chooses which Walsh function is currently being strobed out of the memory. By
combining a set of these functions, each appropriately weighted and accessed at the same frequency, it is
possible to generate any waveform of that frequency.
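
The claim that weighted Walsh functions can rebuild any waveform of a given period can be illustrated
with a short Python sketch. It is only an illustration: the Walsh functions are generated with the
Sylvester (Hadamard) recursion rather than by a BRM, so they appear in natural order instead of sequency
order, and the helper names are hypothetical. Note that the resynthesis loop uses only "add the weight"
or "add its negation", never a true multiplication.

```python
def hadamard(n_bits):
    """Rows are the 2**n_bits Walsh functions (natural order), as lists of +1/-1."""
    rows = [[1]]
    for _ in range(n_bits):
        rows = [r + r for r in rows] + [r + [-x for x in r] for r in rows]
    return rows


def walsh_coefficients(samples, walsh):
    """Project one period of a waveform onto each Walsh function."""
    n = len(samples)
    return [sum(w * s for w, s in zip(row, samples)) / n for row in walsh]


def synthesize(coeffs, walsh):
    """Rebuild the waveform using complement / not complement weighting only."""
    n = len(walsh[0])
    out = [0.0] * n
    for a, row in zip(coeffs, walsh):
        for k in range(n):
            out[k] += a if row[k] > 0 else -a
    return out


walsh = hadamard(3)
waveform = [0, 3, 1, -2, 5, 5, -1, 0]          # arbitrary example period
rebuilt = synthesize(walsh_coefficients(waveform, walsh), walsh)
```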

A Walsh function synthesizer that would perform algorithm "I" shown in the "Synthesis Functions
Used in Music" figure was designed on paper. A block diagram of the machine is shown in the "Walsh
Function Synthesizer" figures. Note the similarity between these diagrams and the "Inexpensive Synthesizer"
figures.

The reason for this similarity is that since a BRM can be used for producing Walsh functions, the
circuit produced for the inexpensive synthesizer can be used to generate the Walsh function derivatives in a
Walsh function digital signal processor.

The task of selecting an oscillating frequency for the BRM output functions in the "Walsh Waveform
Generation Section" is done by using a second BRM (labeled BRM-2). The output frequency of BRM-2 is
specifiable to 24 bits. If desired, a "Floating to Fixed Converter" could be used to feed the binary word input
into BRM-2.

Unfortunately, the output of BRM-2 must be viewed as a jittery square wave. The 1/n counter
would be used to remove some of the jitter before this signal was used to clock the Walsh Function Generator
unit.

The jitter problem is critical to the design. Walsh function synthesis seems to trade off the
information storage in amplitude for information storage in sequency. (An analogy useful in understanding
this is to recall the difference between AM and FM encoding of information.) Since the jitter would corrupt
the duty cycle, it would severely affect the information stored in it. Jitter is removed by making the n in the
1/n divider very large. However, if n was made too large, then the BRM-2 method would produce a
fundamental frequency of too low a value to be acceptable. In this case another method of frequency
generation would have to be substituted.

After the Walsh function derivatives were produced, they would have to be integrated and
"multiplied" by their coefficients. The method for doing this is shown in the "Coefficient Weighting and
Amplitude Modulation Section" figure. Note the similarities between this figure and the "Square Wave
Amplitude Modulation Section" figure. The "Two's Complementer" in the Walsh synthesizer could be as
simple as a row of "Exclusive Or" gates if the error produced by not adding 1 in the two's complement
algorithm is small. Next, the Walsh "harmonics" are summed and multiplied by an amplitude envelope.
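
The size of the error made by a bare row of exclusive-or gates can be seen in a small Python sketch.
The 12 bit word width and the function names are assumptions chosen for illustration. Inverting every
bit without adding 1 gives the one's complement, which is always exactly one least significant bit
below the true negation.

```python
WIDTH = 12                      # assumed coefficient word width (illustrative)
MASK = (1 << WIDTH) - 1


def twos_complement_negate(w):
    """Exact negation: invert every bit, then add 1."""
    return ((w ^ MASK) + 1) & MASK


def xor_negate(w):
    """Row of exclusive-or gates only: invert every bit, skip the +1."""
    return w ^ MASK


def to_signed(w):
    """Interpret a WIDTH-bit word as a two's complement integer."""
    return w - (1 << WIDTH) if w & (1 << (WIDTH - 1)) else w


def weight(coefficient, walsh_value, negate=xor_negate):
    """'Multiply' a coefficient by a Walsh value of +1 or -1."""
    return coefficient if walsh_value > 0 else negate(coefficient)
```

For a coefficient of 100, `to_signed(weight(100, -1))` is -101 rather than -100, so the error is one
part in the full scale of the word and shrinks in relative terms as coefficients grow.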

Unfortunately, people are currently taught to view signals in terms of frequency as opposed to
sequency. Therefore this digital signal processor design would be harder for potential users to use. Another
disadvantage is that each block of Walsh functions is tied to a particular harmonic frequency. Also, if the
synthesizer is not used to generate general purpose waveforms, but just to generate a small number of sine
waves, then the Walsh design is inefficient. Finally, and most important, although the Walsh function
synthesizer can do the weighting without resorting to multiplication, it cannot (as far as I can tell) resort to any
trick to do the amplitude envelope multiplication. Standard multiplication techniques must be used here.

6.3 A Design Based on FM and Sine Summation Synthesis

The digital signal processor constructed for this project was based on ideas proposed by Snell['S] and
Ward[30]. The data flow paths are shown in the "Log Synthesizer" figures. The numerals bounded on each
side by an asterisk are the control points. The numerals labeling the slashed lines indicate the width of data
paths. The heavy dark lines represent pipeline registers. The machine is horizontally microcoded and, as can
be seen, fully pipelined. It is primarily designed to do sine summation synthesis using log table lookup to
perform the required multiplication. All random access memory is interleaved. (Interleaved memory permits
odd memory locations to be written as even locations are read and vice versa.) This is represented by pairs of
boxes between pipeline registers. The feedback path allows the machine to do real time FM synthesis[30].
Other similar methods of synthesis can be implemented by using the tables to lookup other functions.

The data flow of the machine is quite simple. Ignoring the FM input for the present, we see that the
"Phase Update" figure simply performs the assignment operation A = A + B where A is the Phase Memory
and B is the Phase Increment Memory. This loop updates the phase information for each voice. The quantity
stored in the Phase Memory is viewed as being a two's complement number in the sine and FM algorithms.
The Phase Memory contents are given as an argument to the log(sin(x)) lookup function (treatment of
negative numbers will be discussed later). The result of this lookup is added to the logs of the Envelope and
Coefficient quantities and the antilog of the result is taken. Since addition in the log domain is equivalent to
multiplication in the non-log domain, the antilog operation produces the product of the Envelope term, the
Coefficient term and the sine term. These can be summed in the digital mixer or added to the Phase Memory
quantity via the feedback loop. If the latter option is chosen, then FM synthesis can be performed. Finally,
the accumulated results can be clocked out through a DAC.
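
The data flow just described can be sketched in software. The following Python model is only a
floating point illustration of the sine summation path: it performs the A = A + B phase update with
wraparound and mixes envelope * coefficient * sin(phase) for each voice. The 16 bit phase width and
all names are assumptions made for the example, and the log and antilog tables are replaced by a direct
multiplication.

```python
import math

PHASE_BITS = 16                       # assumed phase word width (illustrative)
PHASE_MOD = 1 << PHASE_BITS


def step(phase, increment):
    """A = A + B for every voice, wrapping like a fixed-width adder."""
    return [(a + b) % PHASE_MOD for a, b in zip(phase, increment)]


def sample(phase, envelope, coefficient):
    """Mix one output sample: sum of envelope * coefficient * sin(phase)."""
    total = 0.0
    for a, env, coef in zip(phase, envelope, coefficient):
        angle = 2.0 * math.pi * a / PHASE_MOD
        total += env * coef * math.sin(angle)
    return total


def render(n_samples, increment, envelope, coefficient):
    """Run the update loop for n_samples, starting every voice at phase zero."""
    phase = [0] * len(increment)
    out = []
    for _ in range(n_samples):
        out.append(sample(phase, envelope, coefficient))
        phase = step(phase, increment)
    return out
```

A single voice with an increment of a quarter of the phase range, for example
`render(4, [PHASE_MOD // 4], [1.0], [0.5])`, steps through 0, 90, 180 and 270 degrees.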

Multiplexed into 32 oscillators, the present implementation can synthesize the functions shown as II
and III in the "Synthesis Functions Used in Music" figure. It could potentially synthesize functions IV and V.
Using the 256 oscillator version, one could increase the upper index on the sums by a factor of 8. In the table,
ai, bi, and ci represent weighting coefficients. Ai, Bi and Ci are the amplitude envelope multipliers. Note that
they, unlike the frequency terms, do not have a time term associated with them. This is because the
amplitude envelope is updated by the controlling processor, not the synthesizer.

An alternative to having the processor update every sample in the envelope waveform is to model the
envelope as being a piecewise linear function. Then the processor could simply give the synthesizer "rate"
and "limit" information which is fed to an interpolation algorithm in hardware. Although this method is
clearly superior, it was not implemented in the prototype due to space considerations.

The present design has a 500 ns pipeline clock and implements 32 oscillators. This produces a
sampling speed of 62 kHz. The speed of the table lookup memory causes the pipeline bottleneck. When
parts (Mostek MK4802(P) 2K x 8 Static RAMs with 55 ns access time) become available, the table lookup
memory can be replaced and the number of oscillators can be expanded to 256. This would produce a
sampling speed of 39 kHz with a pipeline clock of 100 ns.
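
The sampling speeds follow directly from the pipeline clock and the number of multiplexed oscillators,
as the short Python calculation below shows. The function name is illustrative, and the only assumption
is that each output sample needs one pipeline slot per oscillator.

```python
def sampling_rate_hz(pipeline_clock_ns, n_oscillators):
    """One output sample needs one pipeline slot per multiplexed oscillator."""
    return 1e9 / (pipeline_clock_ns * n_oscillators)


prototype = sampling_rate_hz(500, 32)     # present design
expanded = sampling_rate_hz(100, 256)     # with the faster table memory
```

These evaluate to 62.5 kHz and roughly 39.06 kHz respectively.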

6.4  Multiplying  Using  Log  Arithmetic 

The  most  unusual  part  of  the  synthesizer  algorithm  is  its  method  of  multiplication. 

Multiplication is important in sound synthesis. Algorithms described in the sound synthesis literature
involve a variety of complex operations: functional composition, convolution, weighting, etc., but in actual
implementations, all practical real time digital sound synthesis systems investigated could not perform their
sound synthesis algorithms without a multiplication operation (unless the synthesis was done totally with
table lookup[29]). This multiplication has been done in the analog domain[27], with digital multipliers[35], or
by interpolating table lookup[26]. (And it can be done with some functional limitations using the
"complement/not complement" operation in the Walsh domain as explained above.)

There are a number of ways that multiplication could be performed in this design. One method uses
simple table lookup. The sine table could output a 6 bit wide sign magnitude representation of the sin(x)
function, the controlling processor could limit the amplitude envelope to a total of 6 bits and the digital signal
processor could use the combined 12 bits to address a multiplication table. The sign bits would be xored to
supply the 13th bit to the table. The multiplication table would also make the sign magnitude to two's
complement conversion. (Note that the processor would not have two amplitude envelopes with this
method.) The output would have a total of 14 bits and have an associated noise equal to that of the sine table,
which is 41 dB[32]. Therefore, any method that we propose should have a signal to noise ratio of less than 44
dB in order to outperform this method. But wait, is this really an intelligent approach? Ignoring the two
amplitude envelope question, we see that the amplitude ratio of the largest generatable signal to the smallest
generatable signal (i.e. the dynamic range) is 64. This is insufficient for our needs. Two things desired of the
synthesizer are low noise and a reasonable dynamic range, but we are willing to make tradeoffs between the
two.

The compromise used here is fixed point log arithmetic. It does not give as low a noise value as some
methods, but it does give a higher pipeline speed than any other method along with a reasonable dynamic
range. In practice, the logarithms of the absolute value of the multiplier and multiplicand are added together
and the result exponentiated. The parity of the result is the sign of the sine multiplier. The multiplicand is
the sum of the coefficient and envelope logs; both the coefficient and envelope are assumed to take on only
non-negative values.
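
The arithmetic behind this scheme can be sketched in a few lines of Python. This is a floating point
illustration only, not the fixed point table implementation: logs of the magnitudes are added, the sum
is exponentiated, and the sign of the sine term is restored afterwards. The function name and the
explicit zero check are assumptions made for the example.

```python
import math


def log_multiply(sine_value, coefficient, envelope):
    """Multiply via logs: add log magnitudes, exponentiate, restore the sign."""
    if sine_value == 0 or coefficient == 0 or envelope == 0:
        return 0.0                       # log(0) is undefined, so treat it separately
    log_sum = (math.log(abs(sine_value))
               + math.log(coefficient)   # coefficient assumed non-negative
               + math.log(envelope))     # envelope assumed non-negative
    magnitude = math.exp(log_sum)
    return -magnitude if sine_value < 0 else magnitude
```

For instance, `log_multiply(-0.5, 3.0, 4.0)` returns a value within rounding error of -6.0.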

The log(sin(x)) table contains the first 180 degrees of the log(sin(x)) wave. The top 11 bits (not
including the sign bit) of the Phase Memory are used to address the table. It is not necessary to take the two's
complement of any negative number before it is input to the table because sin(-x) = -sin(180-x). Note that
even storing a half cycle of the sine wave does not make optimal use of table space since all the information
about a sine wave is contained in the first 90 degrees of its cycle, along with the knowledge that sin(x) =
sin(180-x) and sin(-x) = -sin(x).
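
The addressing trick can be demonstrated with a small Python model. It assumes a 12 bit two's
complement phase word (one sign bit plus 11 address bits) and stores plain sin(x) rather than
log(sin(x)) so that the result is easy to check; the names are illustrative. The address bits of a
negative phase are used unchanged, and only the sign bit negates the output.

```python
import math

ADDRESS_BITS = 11
TABLE_SIZE = 1 << ADDRESS_BITS                 # 2K entries spanning 180 degrees
HALF_SINE = [math.sin(math.pi * k / TABLE_SIZE) for k in range(TABLE_SIZE)]


def sine_from_phase(phase):
    """phase: signed integer in [-2048, 2047]; 2048 steps correspond to 180 degrees."""
    word = phase & (2 * TABLE_SIZE - 1)        # 12-bit two's complement word
    negative = bool(word & TABLE_SIZE)         # sign bit
    address = word & (TABLE_SIZE - 1)          # remaining 11 bits, used unchanged
    value = HALF_SINE[address]
    return -value if negative else value
```

A phase of -k addresses entry 2048-k, which holds sin(180-x), so negating it yields sin(-x) without
any two's complement step on the address.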

The procedures for filling the sine and antilog tables reflect the compromises previously discussed. To
facilitate describing these compromises, the actual values chosen for the table length and width will be stated.

The log(sin(x)) table is 2K deep and 12 bits wide. The antilog table is 16 bits wide and 8K deep, which means
that it has 13 address bits. One bit is used for the parity conversion.

The hardware uses two's complement arithmetic, so the parity conversion section of the algorithm
after the antilog requires an operation equivalent to a full add. The synthesizer has no special purpose
hardware for this. It combines the parity and log operations in the same antilog table. The sign of the sine
argument is used to switch from one table half to another when using the synthesizer for sine summation or
FM synthesis.

The one bit taken for the parity operation leaves 12 bits of argument to the antilog table. If the range
of the log(sin(x)) table fills 12 bits, then the log(sin(x)) wave sweeps the antilog's domain each cycle. The
output of the antilog table produces a sine wave that has a 16 bit maximum amplitude. If any quantity is
added to the log(sin(x)) wave, the input to the antilog table overflows, since the log(sin(x)) already sweeps the
entire domain of the antilog table. If the log(sin(x)) table output were to sweep over 11 bits, it would not
overflow the antilog table so long as the log-amplitude value added to it was less than or equal to 4096. For a
log-amplitude value of 0 the sine wave would have its minimum peak to peak output swing and for a
log-amplitude value of 4096, it would have its maximum amplitude swing. From this we see that the
amplitude range over which the sine wave can vary is dependent on its maximum value in the log domain.
Let us call this value -MaxLog-. Then the relationships between MaxLog, the synthesizer's dynamic range and
the tables' lengths and widths are of interest. They will be shown by the equations described in the next
section.

6.5 Filling the Tables

If we call the maximum peak to peak amplitude that the sine wave assumes -MaxMag-, and the
minimum peak to peak amplitude that the sine wave assumes -MinMag-, then it follows that the sine wave's
dynamic range -Range- is described by

Range = MaxMag/MinMag

Let's define -MaxA- as the maximum value that a quantity can achieve in the log domain. Using the
quantities defined thus far, we can define a scaling ratio that relates the range of values obtainable in the log
domain to the range of values obtainable in the antilog domain. Let's call this value -LScale-, then

LScale = (MaxA - MaxLog)/log(Range)

The quantities defined thus far determine the values of the entries to be assigned in the antilog table.
It turns out that

ith antilog entry = MinMag*exp[ (i - MaxLog)/LScale ]

We see that the extreme points produced by this equation give what is expected. When i assumes the
value MaxLog, the exponential equals 1, and so MinMag is output. When i assumes the value MaxA, the
numerator of LScale divides out, and the quantity log(Range) is exponentiated. This produces the value
Range, which is multiplied by MinMag to produce the value MaxMag.

If we define the quantity

Epsilon = 1/exp[ MaxLog/LScale ]

and

Norm = 3.14159/Maximum log(sin(x)) argument

then it turns out that

ith log(sin(x)) entry = LScale*log[ sin(Norm*i)/Epsilon ]

Norm is obviously a normalization constant that maps the log table arguments into the range
[0,3.14159). Epsilon is another log domain normalization constant. We see that the boundary conditions are
satisfied. If i takes on the value of the argument that maximizes the log(sin(x)) function, then sin(Norm*i) is
1. Epsilon is then inverted, which yields exp[ MaxLog/LScale ] and its log is taken, which gives
MaxLog/LScale. MaxLog/LScale is multiplied by LScale, which yields MaxLog. The other boundary
condition is where sine takes on its minimum value, which is zero. We are then obligated to take the log of
zero, which brings us to our next section.
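
The table equations above can be evaluated directly. The Python sketch below does so in floating
point, under assumed parameter values: MaxA = 4095 (a 12 bit antilog argument), MaxLog = 1023,
Range = 2048, a full scale MaxMag of 32767 and a 2K entry log(sin(x)) table. These values and all
names are assumptions for illustration. No rounding to table word widths is attempted, and the log of
zero is simply represented as negative infinity.

```python
import math

MAX_A = 4095          # assumed: largest 12-bit antilog argument
MAX_LOG = 1023        # assumed MaxLog
RANGE = 2048.0        # assumed dynamic range
MAX_MAG = 32767.0     # assumed full-scale output magnitude
MIN_MAG = MAX_MAG / RANGE
TABLE_LEN = 2048      # 2K log(sin(x)) entries

L_SCALE = (MAX_A - MAX_LOG) / math.log(RANGE)
EPSILON = 1.0 / math.exp(MAX_LOG / L_SCALE)
NORM = math.pi / (TABLE_LEN - 1)      # pi / maximum log(sin(x)) argument


def antilog_entry(i):
    """ith antilog entry = MinMag * exp[(i - MaxLog) / LScale]."""
    return MIN_MAG * math.exp((i - MAX_LOG) / L_SCALE)


def logsin_entry(i):
    """ith log(sin(x)) entry = LScale * log[sin(Norm * i) / Epsilon]."""
    s = math.sin(NORM * i)
    if s <= 0.0:
        return float("-inf")          # log of zero: needs special handling
    return L_SCALE * math.log(s / EPSILON)


logsin = [logsin_entry(i) for i in range(TABLE_LEN)]
# Entries at the low edge of the table that are not greater than zero.
non_positive_low_edge = sum(1 for v in logsin[:TABLE_LEN // 2] if v <= 0.0)
```

With these assumed values the peak of the log(sin(x)) table comes out at MaxLog, the antilog entries
at MaxLog and MaxA reproduce MinMag and MaxMag, and 52 entries at the low edge of the table are not
greater than zero.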

6.6 The Log of Zero or: You Can't Get There from Here

The Epsilon term in the above equations guarantees that most log(sin(x)) table entries will be greater
than zero. But there are several entries near the zero crossing point that are not (about 52 entries at each edge
of a 2K table with a MaxLog of 1023 and a Range of 2048). It was thought that these entries could simply be
set to zero, but the resulting waveform's zero crossings were visibly distorted at all amplitudes, so the problem
had to be corrected. To do this, first the log(sin(x)) entries at the table's edges were filled with their two's
complement negative values calculated from the previously discussed equations. The log of zero was set to
the maximum negative number representable in the table. (All numbers were sign extended.) Next, an
exclusive or gate was wired to the sign bit of the log(sin(x)) table output, and to the carry output of the adder
performing the final log domain addition. The exclusive or output was connected to the "clear" input of the
pipeline register following the antilog table. This is the control input labeled *8* in the "Log Synthesizer
Amplitude Modulation Section" figure.

The logic of this is clear. If there is no carry and the sign bit is zero, then this is a normal operation
and the register should not be cleared. If there is a carry and the sign bit is zero, then there has been a positive
overflow. It is assumed that the processor will not let this occur (unless it wishes to clip the signal). If there is
no carry and the sign bit is one, then a positive number (the log-amplitude) was added to a negative number
(the log(sin(x)) value) of greater absolute value and produced a negative result, so the register is cleared.
Finally, if there was a carry and the sign bit is 1, then a positive number was added to a negative number of
lesser absolute value and produced a positive result, so the register is not cleared.
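
The four cases can be reproduced with a small Python model of the adder. The 12 bit adder width and
the function name are assumptions made for the example; the model only shows how the exclusive-or of
the table's sign bit and the adder's carry output separates negative results from positive ones.

```python
WIDTH = 12                         # assumed adder width (illustrative)
MASK = (1 << WIDTH) - 1


def log_add(logsin, log_amplitude):
    """Add a signed log(sin(x)) entry to a non-negative log-amplitude.

    Returns (antilog_address, clear), where clear models the exclusive-or
    of the table's sign bit with the adder's carry output.
    """
    sign = 1 if logsin < 0 else 0
    total = (logsin & MASK) + (log_amplitude & MASK)   # two's complement words
    carry = total >> WIDTH
    clear = sign ^ carry
    return total & MASK, bool(clear)
```

For example, `log_add(-5, 3)` asserts clear because the sum is negative, while `log_add(-5, 9)` yields
address 4 with clear not asserted. In this model a positive overflow (carry with sign bit zero) also
raises the exclusive-or output.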

As the system is shown, there is no way to completely shut a voice off. The reason is that when the
oscillator is shut off, there is no control over what value is left in the Phase Memory, so some constant value is
always being produced whenever that oscillator voice is processed by the log(sin(x)) table. The voice can be
multiplied by a small amplitude, but never by a zero one. This means that the constant offset is always passed
through. The end effect is that there could be many constant value offsets floating through the system, each
the result of an oscillator stuck in some non-zero phase. The way this problem was solved was to use a special
code in the Envelope Memory to designate "0". When the synthesizer sees that code, it asserts the *8* clear
control of the antilog output pipeline register for that voice.

6.7 How Not to Update Memory

As mentioned previously, all memory is interleaved. This not only increases the pipeline speed, it
also permitted a solution to a serious design flaw in the prototype. To understand the flaw, one must know
that the synthesizer accesses oscillators sequentially, that is, in the sine summation synthesis algorithm, it first
accesses the zeroth voice, then the first, then the second and so on. In the initial design, the Coefficient,
Envelope, and Phase Increment memories are updated by the following algorithm. When they receive an
update value, they hold it until the location it is destined for is accessed by the synthesis algorithm and then
they write it in. The problem with this method is that the wait time for a single voice could be as long as 32
clock cycles. If the controlling processor does not wait 32 clock cycles before sending another value, it stands
the chance of overwriting data it had previously sent. When the controlling processor was updating all 32
voices one after another, it would take 32x32 = 1024 synthesizer clock cycles to update all the registers. This
limits the sampling rate to 2 MHz/1024 = 2 kHz in the prototype for a waveform where all voices are used
and a guarantee of no overwrites is required. This problem is bad in the prototype, but it would be
catastrophic in a 256 voice synthesizer. If a synthesizer of this design had to update 256 voices with a 100 ns
pipeline clock and required no overwrites, then it would be bandlimited to envelopes with a frequency
content of no greater than (10 MHz/(256*256))/2, or about 75 Hz.
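
The update rate figures can be recomputed with a few lines of Python. The function name is
illustrative; the model simply assumes, as in the argument above, that each of the n voices may have to
wait n clock cycles for its write.

```python
def worst_case_update_rate_hz(pipeline_clock_ns, n_voices):
    """Rate at which all voices can be rewritten if each write may wait n_voices cycles."""
    cycles = n_voices * n_voices
    return 1e9 / (pipeline_clock_ns * cycles)


prototype = worst_case_update_rate_hz(500, 32)       # 32 voices, 2 MHz clock
expanded = worst_case_update_rate_hz(100, 256)       # 256 voices, 10 MHz clock
envelope_bandwidth = expanded / 2                    # Nyquist limit on the envelope
```

The results are about 1953 Hz for the prototype and about 76 Hz of envelope bandwidth for the 256
voice case, in line with the rounded figures quoted above.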

However, in this design, all memory is interleaved. This allows one to construct hardware that makes
the controlling processor wait a maximum of two cycles before updating memory. This is better than using a
memory access algorithm that needs time proportional to the square of the memory length in order to update
it.

6.8  Interpolating  Table  Lookup 

Both the sine and antilog tables could have used interpolation to cut down on their lookup memory
table sizes. However, interpolation requires both a multiplication and an addition to perform. The following
argument demonstrates how much memory is saved using linear interpolation to lookup the value of a
piecewise continuous function.

Interpolation is usually accomplished in hardware by using a multiplier, an adder and a slope or
derivative table to calculate Interpolated Value = Lookup Value + Fraction * Slope.
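
The interpolation formula can be illustrated with a short Python sketch for a sine table. The 64 entry
table size and the names are assumptions chosen to keep the example small; the slope table here holds
the difference between neighbouring lookup entries.

```python
import math

N_BITS = 6                              # assumed table address width (illustrative)
SIZE = 1 << N_BITS
LOOKUP = [math.sin(2 * math.pi * k / SIZE) for k in range(SIZE)]
# Slope table: difference between neighbouring entries, wrapping at the end.
SLOPE = [LOOKUP[(k + 1) % SIZE] - LOOKUP[k] for k in range(SIZE)]


def interpolated_sine(x):
    """x in [0, 1) is the phase as a fraction of one cycle."""
    position = x * SIZE
    index = int(position) % SIZE
    fraction = position - int(position)
    # Interpolated Value = Lookup Value + Fraction * Slope
    return LOOKUP[index] + fraction * SLOPE[index]
```

Each evaluation costs one multiplication and one addition on top of the two table reads, which is the
cost weighed against the memory saving in the discussion that follows.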

The question arises, how much memory is saved by using interpolation in the table lookup of any
given piecewise linear function? This is rather easily calculated. Since any given function might swing full
scale, the most significant bit of the slope table must have the same weight as the most significant bit of the
lookup table. Also, since we presumably want to preserve precision, the Fraction and hence the Slope values
must be of the same length as the Lookup Value.

Long Term Power Spectrum studies using arbitrarily large precision[28] have indicated that if one
uses the interpolation method with each table having 2**n entries, then it will take a table of 2**2n locations
with a normal table lookup to get as good a signal. Together with the bit significance argument, which shows
that we need two tables of the same length in order to interpolate, interpolation seems to save space by a
factor of 2**(n-1) (if piecewise linearity can be assumed). This result was reported by [32] for sine functions
but not for piecewise linear functions.
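
The saving factor follows from counting table locations, as this small Python sketch shows. The
function names are illustrative, and the word widths of the two tables are taken to be equal, as the
bit significance argument requires.

```python
def direct_table_words(n):
    """Direct lookup needing 2**(2n) locations for comparable quality."""
    return 2 ** (2 * n)


def interpolated_table_words(n):
    """Lookup table plus slope table, 2**n entries each."""
    return 2 * 2 ** n


def saving_factor(n):
    """Ratio of direct to interpolated storage, equal to 2**(n-1)."""
    return direct_table_words(n) // interpolated_table_words(n)
```

For n = 11, `saving_factor(11)` is 1024: two 2K tables would stand in for a 4M entry direct table.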

However, to do interpolation, it is necessary to perform a multiply and the author knows of no
multiplier of 12 bit (or greater) precision that can (practically) operate in under 150 ns.

6.9 The Curse of Fixed Point

Fixed point limits the algorithm in two crucial ways. The first is in the antilog table lookup
operation. For many antilog table arguments, 10 bits of argument are compressed into 4 bits of result. (This
points out the interesting fact that log table lookup can do a truncation on multiplication.) The sine wave is
very distorted for small values, which is to be expected. The way to get around this would be to use floating
point. Floating point gives one fixed signal to noise ratio across the entire dynamic range of the output. For a
fixed word size, floating point does not give as good a signal to noise ratio for high amplitude output values as
does fixed point, but it does produce a much better signal at low amplitudes. However, if the antilog
operation looked up a value to be fed to a floating point DAC, the digital mixing would be more complex and
there could be no feedback path from the mixer to the FM register.

The second place where fixed point limits the machine's operation is in the performance of the FM
synthesis algorithm. The machine must choose between using a wide range of slightly varying modulation
indices or a small range of greatly varying indices. This is due in part to the truncation occurring in the antilog
table lookup.

6.10 Conclusions

The design is adequate, but it has several disadvantages.

The first and most important is that it is a hardware implementation of a specific algorithm as
opposed to being a general signal processor. This is dramatically illustrated by the fact that the FM synthesis
microcode can only produce a quarter the number of sine summation terms (8 versus 32). One would initially
expect that the number of synthesizable FM terms would equal half the number of sine summation terms
since FM uses twice as many sine terms as does sine summation. The processor has the memory capacity to
synthesize more FM terms, it just has no fast data paths leading to that memory. (As more memory is
multiplexed in, this ratio will approach 1/3.)

The second most important problem is that updating the processor amplitude envelope "by hand"
limits the number of producible sounds. Many sounds require that the envelope information be updated
just as fast as the frequency information. This is clearly an impossible task in the present design for a
conventional general purpose processor if the number of envelopes is 256. "Rate and limit" hardware is a
necessary, if expensive, addition to the synthesizer if such high bandwidth envelopes are desired. Another
solution to this problem would be to place these high frequency harmonics into the phase registers. But then
the envelope would be harder to model.

The present design also has a large table lookup memory cost associated with it. However, this is
more of a feature than a problem. The amount of table lookup memory can be dramatically reduced if one
wishes to only use one set of optimized lookup functions.

It  would  have  been  nicer  if  the  frequency  word  length  had  been  four  bits  wider  for  some  applications. 

The use of log arithmetic was successful in this application. It produces a processor that can
potentially outperform any other in its price class in terms of speed.

[Figure: A Music Synthesizer System Architectures]
[Figure: Inexpensive Synthesizer]
[Figure: Inexpensive Synthesizer - Square Wave Amplitude Modulation Section]

Synthesis Functions Used in Music


I Walsh Synthesis

SUM[i = 1..] ai * Wal(bi, t)

II Sine Summation Synthesis

SUM[i = 1..32] Ai * ai * sin[ (wi)(t) ]

III FM Synthesis

SUM[i = 1..] Ai * ai * sin[ (wi)(t) + (Bi)(bi)sin( (vi)(t) ) ]

IV

Ai * ai * sin[ (wi)(t) + (Bi)(bi)sin( (vi)(t) ) + (Ci)(ci)sin( (ui)(t) ) ]

V

Ai * ai * sin[ (wi)(t) + (Bi)(bi)sin{ (vi)(t) + (Ci)(ci)sin( (ui)(t) ) } ]

[Figure: Walsh Function Synthesizer - Walsh Waveform Generation Section (Multiplexed)]
[Figure: Walsh Function Synthesizer]